Options to aid zero-downtime deployment / process supervision #132

Merged: nevans merged 4 commits into resque:master from ShippingEasy:upstream-pr on Oct 14, 2015

Conversation

@brasic (Contributor) commented Oct 6, 2015

This PR includes the remaining differences in ShippingEasy's resque-pool fork, both of which are optional command line flags that tweak startup behavior to support zero-downtime deployment strategies and running resque-pool inside a process supervisor.

The normal approach to deploying new worker code is to signal resque-pool to gracefully shut down with QUIT or INT, then start up a new instance. The problem with that approach is that long-running jobs and slow app startup can lead to a dramatic reduction in worker throughput during a deploy, which in our case is unacceptable.

The technique we use to maintain full throughput during deploys is to start a second resque-pool instance running the new code; it loads the Rails environment and gracefully shuts down the previous instance only once its own children are ready to accept work. This requires two new command line options:

  1. --no-pidfile: Launching a second resque-pool instance in daemon mode was not possible because it detects the prior instance by its pidfile and refuses to start. This option causes the daemon to skip pidfile creation so multiple instances can run at the same time. Of course, this should only be used if the daemon is run under a supervisor like upstart or systemd that can automatically detect the pids of managed processes.

  2. --lock FILE: Even though upstart and systemd can detect the pids of processes they manage, neither copes well with code redeploys, where an entirely new process replaces the existing one. For this we rely on an approach adapted from this post, in which resque-pool opens a shared filesystem lock on a designated file at startup. Deploys cause the new process to open an additional shared lock on the same file before the old process exits, meaning that as long as at least one instance is running, the file will always be locked in shared mode. The process supervisor starts resque-pool and then immediately attempts to open an exclusive lock on the same file. This attempt will block until all shared locks are cleared, so the supervisor will only try to restart resque-pool when there are truly no instances running. In this way the shared lock file acts as a surrogate pid file, except that it can be shared across redeployed processes (see the sketch below).
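
To make that concrete, here is a minimal Ruby sketch of the locking pattern, assuming a hypothetical lock path; it is illustrative only, not the implementation added by this PR:

require "fileutils"

LOCK_PATH = "/var/run/resque-pool/pool.lock" # hypothetical location

# Pool side: take a shared lock at startup and hold it for the life of
# the process; the OS releases it automatically when the process exits.
def hold_shared_lock(path = LOCK_PATH)
  FileUtils.mkdir_p(File.dirname(path))
  file = File.open(path, File::RDWR | File::CREAT, 0644)
  file.flock(File::LOCK_SH)
  file # keep a reference so the file descriptor stays open
end

# Supervisor side: an exclusive lock blocks while any shared lock is
# held, i.e. until every old and new pool instance has exited.
def wait_for_all_pools_to_exit(path = LOCK_PATH)
  File.open(path, File::RDWR | File::CREAT, 0644) do |file|
    file.flock(File::LOCK_EX)
  end
end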

This approach is a little unusual but it works well for us - we are able to deploy many times per day under heavy load with zero impact on job throughput or latency. We would like to switch from our fork to the official version, so I hope these contributions will be useful to others.

@joshuaflanagan (Contributor) commented:

👍 This code has been running reliably for us for almost a year.

@nevans (Collaborator) commented Oct 6, 2015

Nice, and simple too!

  1. Would you mind adding some documentation? Just a short blurb in README.md and some example upstart (or systemd) and capistrano and whatever else is necessary to make it all work. At some point, I'd like to directly support capistrano via require "resque/pool/capistrano" and clean up the examples dir. But for today, a sentence and code snippets in README.md will do. :)
  2. How does the new pool obtain the pid of the old pool? How do you know when the new workers are ready to accept work? Did you make a callback for that, or do you use some existing callback to trigger signaling the old master? Is it possible to package that code up as well, perhaps enabled via another command line option? E.g.: resque-pool --restart-with-shared-lock could set --no-pidfile --lock tmp/resque-pool.lock (sensible default lockfile location) and then signal the original master after startup is complete (see the sketch after this list).
  3. This could get folks in trouble if they are tightly memory constrained (since the new pool doesn't wait for the old pool to shut down its workers before it starts up the new ones), but for most users that's less of an issue than the downtime. Besides, I think that another fork might already have a solution for this. But it's probably worth mentioning that in the README.md.
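
As a rough illustration of the suggestion in item 2 (not an API that exists in resque-pool today), the composite option might simply expand into the two underlying options before normal option handling runs:

# Hypothetical expansion of the suggested --restart-with-shared-lock
# option into the two flags added by this PR.
def expand_restart_option(opts)
  return opts unless opts[:restart_with_shared_lock]
  opts[:no_pidfile] = true
  opts[:lock] ||= "tmp/resque-pool.lock" # suggested default location
  opts
end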

@brasic (Contributor, Author) commented Oct 6, 2015

Sure, I can add some documentation in the readme.

We handle old-pool termination inside our application with a utility that parses the output of ps. So our resque:pool:setup task looks like:

task "resque:pool:setup" do
  # [snip activerecord reconnect stuff]
  obtain_shared_lock
  ResquePoolReaper.shutdown_other_pools
end

It isn't ideal to trust the application to do this; ideally resque-pool would have first-class support for this step itself, but we didn't want to patch it too much, and the shutdown_other_pools functionality isn't tested on anything other than Linux.
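
For illustration only, a ps-based reaper along these lines might look roughly like the following; this is hypothetical code, not the actual ShippingEasy implementation, and it assumes the pool master's process title contains "resque-pool-master":

module ResquePoolReaper
  # Pids of other resque-pool master processes on this host.
  def self.other_pool_pids
    `ps -eo pid,command`.each_line.map do |line|
      pid, command = line.strip.split(" ", 2)
      next unless command.to_s.include?("resque-pool-master")
      pid.to_i
    end.compact.reject { |pid| pid.zero? || pid == Process.pid }
  end

  # Ask every other pool master to shut down gracefully.
  def self.shutdown_other_pools
    other_pool_pids.each { |pid| Process.kill("QUIT", pid) }
  end
end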

If you're open to it I wouldn't mind taking a stab at cleaning up and integrating this functionality into resque-pool itself so that it would be easier for other users to get working.

@brasic (Contributor, Author) commented Oct 6, 2015

Regarding 3), it's true that this method has the potential to use more memory than a more 'hands-on' approach, e.g. one where the new instance somehow communicated with the old instance and waited for it to release slots before forking each new managed worker. But I think that would be pretty complicated, and in practice the memory bump is not very substantial on Linux due to copy-on-write.

I'd be interested to see the approach taken by the other fork you mentioned.

@nevans (Collaborator) commented Oct 7, 2015

I'm not a huge fan of ps parsing, but it's better than doing nothing. If it's been working for you for a year on both Linux and Mac OS X, then it'll be fine (at least as a first pass). So go ahead and post what you have. We could also consider using a different approach, e.g. pid-dir instead of pid-file, or register pool pids in redis.

@nevans (Collaborator) commented Oct 7, 2015

backupify's fork detects the orphaned worker count via a ps | grep | awk monstrosity (with plenty of other ps usage for memory management elsewhere in their fork) and then uses that as a rough delta from the configured worker count. I'd personally prefer a slightly different approach, e.g. create a new custom config_loader that looks at the workers set in redis and diffs against the workers in the current pool.

# Sketch: group this host's registered workers by their queue list.
# (`hostname` is assumed to be defined elsewhere.)
Resque.redis.smembers("workers").
  select  { |w| w.start_with?("#{hostname}:") }.
  map     { |w| w.split(":", 3) }.
  sort_by(&:last).
  each_with_object(Hash.new { |h, k| h[k] = [] }) do |(_host, pid, queues), h|
    h[queues] << Integer(pid)
  end

That approach doesn't work quite so well for multiple pools per hostname. This can be worked around by keeping a separate registry. Alternatively, resqued manages this by having another process layer above the pool manager (their top layer is their "manager" and our manager layer is their "listener"), and they use a socket to communicate all worker starts/stops up from the listeners to the manager and broadcast them down from the manager to all other listeners. I'd just as soon use a socket file as create a new process.

@brasic (Contributor, Author) commented Oct 7, 2015

I would also like to stop parsing ps. I like the idea of a pid directory - a similar option that would simplify the code might be to replace --no-pidfile with an --allow-multi flag that causes the existing pidfile to be treated as a newline-separated list of unique pids (sketched below).
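
Purely to illustrate the --allow-multi idea (nothing like this exists in resque-pool), appending to a shared pidfile might look like this, with an exclusive flock guarding concurrent writers:

# Hypothetical --allow-multi behavior: append our pid to a shared,
# newline-separated pidfile instead of refusing to start.
def register_pid(pidfile)
  File.open(pidfile, File::RDWR | File::CREAT, 0644) do |f|
    f.flock(File::LOCK_EX)                  # guard against concurrent writers
    pids = f.read.split("\n").reject(&:empty?)
    pids |= [Process.pid.to_s]              # keep the list unique
    f.rewind
    f.truncate(0)
    f.write(pids.join("\n") + "\n")
  end
end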

@nevans (Collaborator) commented Oct 8, 2015

For speed of getting this in, let's see your ps implementation before we spend a lot of time playing around with alternate approaches. My gut is that ps is okay if all we're doing is detecting another pool master on the same host and signalling it to shut down.

Once we get into balancing total running workers (and carefully transitioning from one pool to the other to constrain the max total workers), we'll want something better than ps for that, and we might as well pick an approach that works well for both use cases. Using a shared pid-dir or socket can work for both, but I'm strongly biased towards registering both the pool master and the pool workers in redis for the simple reason that we can also then display info about running pools in resque-web.
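
A back-of-the-envelope sketch of that redis registration idea; the "pool:masters" key and the cleanup strategy here are made up for illustration:

require "socket"
require "resque"

# Hypothetical registration of a pool master in redis, so other pools
# (and, eventually, resque-web) could discover it.
hostname = Socket.gethostname
member   = "#{hostname}:#{Process.pid}"

Resque.redis.sadd("pool:masters", member)              # on startup
at_exit { Resque.redis.srem("pool:masters", member) }  # best-effort cleanup

# Another pool on the same host could then find masters to signal:
other_masters = Resque.redis.smembers("pool:masters").
  select { |m| m.start_with?("#{hostname}:") }.
  map    { |m| Integer(m.split(":").last) }.
  reject { |pid| pid == Process.pid }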

@nevans (Collaborator) commented Oct 8, 2015

Also, I'd want to avoid multiple processes writing comma-delimited pids to the same file, for the simple reason that I don't want to deal with race-conditions and/or locks when it's possible to design around them.

@nevans mentioned this pull request on Oct 8, 2015
@joshuaflanagan (Contributor) commented:

Just wanted to give an update that I'm working on a cleaned-up version of our ps implementation that @brasic and I will test and likely submit as a separate PR.

joshuaflanagan added a commit to ShippingEasy/resque-pool that referenced this pull request Oct 11, 2015
Adds the `--kill-others` command line option.
When this option is set, resque-pool will send a graceful
shutdown signal to all other running resque-pools.

This is useful in "no downtime deploy" scenarios,
where you want the pool running the new code to
completely load its environment and be ready to
process work *before* shutting down the old code.
Once the new code is ready, it will shut down the
old processes for you.

See also resque#132
brasic added a commit to ShippingEasy/resque-pool that referenced this pull request Oct 12, 2015
This explains functionality from resque#132 and resque#137 and provides example
upstart config files that will allow for zero-downtime deploys.
This change makes it possible to run multiple copies of resque-pool on
the same server using the --daemon flag, which was previously not
allowed since the server refuses to start if its configured pidfile
already exists.  Process supervisors like upstart and systemd already
know what pids they are managing, so in this context a pidfile is
unnecessary.

Without the ability to run concurrent daemon instances, zero-downtime
deployment (in which a new resque-pool instance starts up and slowly
replaces the previous instance until it is done) is not feasible.
This is to support running resque-pool inside of upstart while allowing
for zero-downtime restarts.  It enables the use of this strategy:

http://orchestrate.io/blog/2014/03/26/tip-seamless-restarts-with-unicorn-and-upstart/

The idea is that when the pool starts, it opens a shared lock that it
will hold forever.

In the envisioned use-case, upstart is responsible for restarting a
resque-pool if it fails.  When resque-pool daemonizes, upstart keeps
track of the process id.  However, to ensure no downtime, we do not stop
and start the pool.  Instead, when a deploy happens, a new resque-pool
instance will start and kill the old one after it is ready to fork
workers.  This would ordinarily cause upstart to detect that the
original process has died and relaunch it, which is not desired
behavior.  If the upstart init script tries to obtain an exclusive lock,
this attempt will block while any pool instance is still running,
including ones other than the original process.
@brasic (Contributor, Author) commented Oct 13, 2015

I've added sample upstart configs and a readme section that documents the zero-downtime use case, including an explanation of the flag @joshuaflanagan introduced in #137.

@nevans (Collaborator) commented:

--pidfile FILE should also override --no-pidfile; opts.delete(:no_pidfile)

Time to add specs for the option parsing, now that
it is gaining some non-trivial logic.

CLI#parse_options can accept an array of arguments,
to simplify testing.
@joshuaflanagan (Contributor) commented:

@nevans I've addressed your concern regarding --pidfile overriding --no-pidfile

nevans added a commit that referenced this pull request Oct 14, 2015
Options to aid zero-downtime deployment / process supervision
@nevans merged commit 1f409a4 into resque:master on Oct 14, 2015
@nevans (Collaborator) commented Oct 14, 2015

I tweaked the --pidfile --no-pidfile interplay a little bit in 021056d. Thanks, again.

@jrochkind (Contributor) commented:

A systemd example would be super helpful.

@brasic (Contributor, Author) commented Jan 31, 2019

@jrochkind For what it's worth, under upstart running two instances was necessary but now that we use systemd we run a single instance and use SIGINT to orphan existing jobs on redeploy. Here's the unit definition we have been running for about two years now:

[Unit]
Description=resque pool manager
Documentation=https://github.com/nevans/resque-pool/
Requires=network.target

[Service]
Type=forking
User=<USER>
Group=<GROUP>
WorkingDirectory=<APP DIR>
EnvironmentFile=<ENV FILE>
PIDFile=<PATH TO PIDFILE>
# Allow resque to adjust its nice value (and the value of child processes)
LimitNICE=40
ExecStart=/usr/local/rvm/bin/rvm-shell -c 'bundle exec resque-pool --daemon --pidfile <PATH_TO_PIDFILE>'
# Only kill the main process.  INT will cause resque-pool-master to reparent
# children under PID 1 and exit, leaving existing jobs to run until complete.
KillMode=process
KillSignal=INT
SendSIGHUP=no
SendSIGKILL=no
TimeoutStopSec=60
TimeoutSec=60
Restart=on-failure
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=%n

[Install]
WantedBy=multi-user.target

@nevans (Collaborator) commented Mar 8, 2019

Will gladly (quickly?!) merge any PRs updating the incredibly out-of-date examples directory. 😉
I've been deploying under docker for a couple of years now, so I ought to put something in there for that!
