Options to aid zero-downtime deployment / process supervision #132

nevans merged 4 commits into resque:master
Conversation
👍 This code has been running reliably for us for almost a year.

Nice, and simple too!
Sure, I can add some documentation in the readme. We handle old-pool-termination inside our application by a utility that parses the output of

```ruby
task "resque:pool:setup" do
  # [snip activerecord reconnect stuff]
  obtain_shared_lock
  ResquePoolReaper.shutdown_other_pools
end
```

It isn't ideal to trust the application to do this - ideally resque-pool would have first-class support for doing this step itself, but we didn't want to patch it too much, and the

If you're open to it I wouldn't mind taking a stab at cleaning up and integrating this functionality into resque-pool itself so that it would be easier for other users to get it working.
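
The fork's actual ResquePoolReaper is not shown in this thread. Purely as an illustration of the idea described above, a reaper that discovers other pool masters by scanning the process table (the approach debated later in this thread) might look roughly like this; the proctitle match and helper names are assumptions, not ShippingEasy's or resque-pool's code:

```ruby
# Hypothetical sketch only, not the fork's actual ResquePoolReaper.
module ResquePoolReaper
  module_function

  # Ask every other resque-pool master on this host to shut down gracefully.
  def shutdown_other_pools
    other_master_pids.each do |pid|
      begin
        Process.kill("QUIT", pid)  # QUIT asks a pool to wind down gracefully
      rescue Errno::ESRCH
        # the process exited between discovery and signaling; nothing to do
      end
    end
  end

  # Hypothetical discovery by scanning the process table; matching on a
  # "resque-pool-master" proctitle is an assumption.
  def other_master_pids
    `ps -eo pid,args`.each_line.map do |line|
      pid, args = line.strip.split(" ", 2)
      next unless args.to_s.include?("resque-pool-master")
      next if pid.to_i == Process.pid
      pid.to_i
    end.compact
  end
end
```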
Regarding 3), it's true that this method has the potential to use more memory than a more 'hands-on' approach, e.g. one where the new instance somehow communicated with the old instance and waited for it to release slots before forking each new managed worker. But I think that would be pretty complicated, and in practice the memory bump is not very substantial on Linux due to copy-on-write. I'd be interested to see the approach taken by the other fork you mentioned.
I'm not a huge fan of
backupify's fork detects orphaned worker count via a

```ruby
require "socket"
hostname = Socket.gethostname  # Resque worker ids look like "host:pid:queue1,queue2"

Resque.redis.smembers("workers").
  select { |w| w.start_with?("#{hostname}:") }.
  map { |w| w.split(":", 3) }.
  sort_by(&:last).
  each_with_object(Hash.new { |h, k| h[k] = [] }) do |(_host, pid, queues), h|
    h[queues] << Integer(pid)
  end
```

That approach doesn't work quite so well for multiple pools per hostname. This can be worked around by keeping a separate registry. Alternatively, resqued manages this by having another process layer above the pool manager (their top layer is their "manager" and our manager layer is their "listener"), and they use a socket to communicate all worker starts/stops up from the listeners to the manager and broadcast down from the manager to all other listeners. I'd just as soon use a socket file as create a new process.
I would also like to stop parsing
For speed of getting this in, let's see your

Once we get into balancing total running workers (and carefully transitioning from one pool to the other to constrain the max total workers), we'll want something better than
Also, I'd want to avoid multiple processes writing comma-delimited pids to the same file, for the simple reason that I don't want to deal with race-conditions and/or locks when it's possible to design around them.
Just wanted to give an update that I'm working on a cleaned up version of our

Adds the `--kill-others` command line option. When this option is set, resque-pool will send a graceful shutdown signal to all other running resque-pools. This is useful in "no downtime deploy" scenarios, where you want the pool running the new code to completely load its environment and be ready to process work *before* shutting down the old code. Once the new code is ready, it will shut down the old processes for you. See also resque#132
This explains functionality from resque#132 and resque#137 and provides example upstart config files that will allow for zero-downtime deploys.
This change makes it possible to run multiple copies of resque-pool on the same server using the --daemon flag, which was previously not allowed since the server refuses to start if its configured pidfile already exists. Process supervisors like upstart and systemd already know what pids they are managing, so in this context a pidfile is unnecessary. Without the ability to run concurrent daemon instances, zero-downtime deployment (in which a new resque-pool instance starts up and slowly replaces the previous instance until it is done) is not feasible.
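
As an illustration of the constraint this lifts, here is a sketch of the behavior described above (not resque-pool's literal startup code):

```ruby
# With a pidfile configured, a second daemonized instance refuses to start,
# which is what previously prevented an old and a new pool from overlapping.
pidfile = "/var/run/resque-pool.pid"   # example path
if File.exist?(pidfile)
  abort "resque-pool appears to be running already (#{pidfile} exists)"
end
File.open(pidfile, "w") { |f| f.puts Process.pid }
```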
This is to support running resque-pool inside of upstart while allowing for zero-downtime restarts. It enables the use of this strategy: http://orchestrate.io/blog/2014/03/26/tip-seamless-restarts-with-unicorn-and-upstart/ The idea is that when the pool starts, it opens a shared lock that it will hold forever. In the envisioned use-case, upstart is responsible for restarting a resque-pool if it fails. When resque-pool daemonizes, upstart keeps track of the process id. However, to ensure no downtime, we do not stop and start the pool. Instead, when a deploy happens, a new resque-pool instance will start and kill the old one after it is ready to fork workers. This would ordinarily cause upstart to detect that the original process has died and relaunch it, which is not desired behavior. If the upstart init script tries to obtain an exclusive lock, this attempt will block while any pool instance is still running, including ones other than the original process.
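
A minimal Ruby sketch of the pool-side half of this strategy (the lock path here is an example; the real one is supplied by the new `--lock FILE` option):

```ruby
# The pool opens the lock file at startup and takes a *shared* lock that it
# holds for its entire lifetime. Any number of pool instances can hold a
# shared lock on the same file at once, so old and new pools may overlap.
lock_path = "/var/run/resque-pool.lock"           # from --lock FILE
lock_file = File.open(lock_path, File::RDWR | File::CREAT, 0644)
lock_file.flock(File::LOCK_SH)                    # released when the process exits
```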
I've added sample upstart configs and a readme section that documents the zero-downtime use case, including an explanation of the flag @joshuaflanagan introduced in #137.

`--pidfile FILE` should also override `--no-pidfile`; `opts.delete(:no_pidfile)`.

Time to add specs for the option parsing, now that it is gaining some non-trivial logic. `CLI#parse_options` can accept an array of arguments, to simplify testing.
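
A sketch of the kind of spec being asked for, assuming `parse_options` is reachable as shown and returns a hash of parsed options (the exact receiver and return shape may differ in the real CLI):

```ruby
require "spec_helper"

describe Resque::Pool::CLI do
  it "lets an explicit --pidfile override --no-pidfile" do
    opts = Resque::Pool::CLI.parse_options(
      %w[--no-pidfile --pidfile /tmp/resque-pool.pid]
    )
    expect(opts[:pidfile]).to eq("/tmp/resque-pool.pid")
    expect(opts[:no_pidfile]).to be_falsey
  end
end
```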
@nevans I've addressed your concern regarding
I tweaked the
A systemd example would be super helpful.
@jrochkind For what it's worth, under upstart running two instances was necessary, but now that we use systemd we run a single instance and use SIGINT to orphan existing jobs on redeploy. Here's the unit definition we have been running for about two years now:
Will gladly (quickly?!) merge any PRs updating the incredibly out-of-date examples directory. 😉

This PR includes the remaining differences in ShippingEasy's resque-pool fork, both of which are optional command line flags that tweak startup behavior to support zero-downtime deployment strategies and running resque-pool inside a process supervisor.
The normal approach to deploying new worker code is to signal resque-pool to gracefully shut down with `QUIT` or `INT`, then start up a new instance. The problem with that approach is that long-running jobs and slow app startup can lead to a dramatic reduction in worker throughput during a deploy, which in our case is unacceptable.

The technique we use to maintain full throughput during deploys is to start a second resque-pool instance with the new code, which loads a Rails environment and gracefully shuts down the previous instance only when its children are ready to accept work. This requires two new command line options:

- `--no-pidfile`: Launching a second resque-pool instance in daemon mode was not possible because it detects the prior instance by its pidfile and refuses to start. This option causes the daemon to skip pidfile creation so multiple instances can run at the same time. Of course, this should only be used if the daemon is run under a supervisor like upstart or systemd that can automatically detect the pids of managed processes.
- `--lock FILE`: Even though upstart and systemd can detect the pids of processes they manage, neither copes well with code redeploys, where an entirely new process replaces the existing one. For this we rely on an approach adapted from this post, where resque-pool opens a shared filesystem lock on a designated file at startup. Deploys cause the new process to open an additional shared lock on the same file before the old process exits, meaning that as long as at least one instance is running the file will always be locked in shared mode. The process supervisor starts resque-pool and then immediately attempts to open an exclusive lock on the same file. That attempt will block until all shared locks are cleared, so the supervisor will only try to restart resque-pool when there are truly no instances running. In this way the shared lock file acts as a surrogate pid file, except that it can be shared across redeployed processes (a sketch of the supervisor side of this handshake appears below).

This approach is a little unusual, but it works well for us - we are able to deploy many times per day under heavy load with zero impact on job throughput or latency. We would like to switch from our fork to the official version, so I hope these contributions will be useful to others.
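
The supervisor side of the `--lock` handshake, expressed in Ruby purely for illustration (an init script would typically get the same effect with the `flock(1)` utility); this is not code from this PR:

```ruby
# An exclusive flock on the lock file blocks while any pool instance, old or
# freshly deployed, still holds its shared lock.
File.open("/var/run/resque-pool.lock", File::RDWR | File::CREAT, 0644) do |f|
  f.flock(File::LOCK_EX)   # blocks until every shared lock has been released
end
# Reaching this point means no resque-pool instance is running, so the
# supervisor can treat the job as stopped and restart it if appropriate.
```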