Came up with a solution that works. I have a base class called BaseJob that all of my delayed jobs inherit from:
class BaseJob
  attr_accessor :live_hash

  def before(job)
    # Ask the web app which revision of the code it is currently running
    resp = HTTParty.get("#{Rails.application.config.root_url}/revision")
    self.live_hash = resp.body.strip
  end

  def should_perform
    live_hash == GIT_REVISION
  end

  def perform
    safe_perform if should_perform
  end

  def safe_perform
    # override this method in subclasses
  end

  def success(job)
    if should_perform
      # log stats here about a success
    else
      # log stats here about a failure
      # enqueue a new job of the same kind
      new_job = Delayed::Job.new
      new_job.priority = job.priority
      new_job.handler = job.handler
      new_job.queue = job.queue
      new_job.run_at = job.run_at
      new_job.save
      job.delete
      # stop this consumer; the cron watchdog below will spawn a fresh one with new code
      %x(RAILS_ENV=#{Rails.env} ./script/delayed_job stop)
    end
  end
end
All job classes inherit from BaseJob and override safe_perform to actually do their work. A few assumptions about the above code:
- Rails.application.config.root_url points to the root of your app (e.g. www.myapp.com)
- There is a route exposed called /revision (e.g. www.myapp.com/revision)
- There is a global constant called GIT_REVISION that your app knows about
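For reference, the /revision route can be as simple as a Rack-style endpoint that returns the constant as plain text. This is a sketch, not the exact route from my app:

```ruby
# config/routes.rb -- hypothetical; any route that renders the running
# revision as plain text will satisfy the assumption above.
Rails.application.routes.draw do
  get '/revision', to: proc { |env|
    [200, { 'Content-Type' => 'text/plain' }, [GIT_REVISION]]
  }
end
```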
What I ended up doing was putting the output of git rev-parse HEAD in a file and deploying that file with the code. It gets loaded at startup, so it's available in the web app as well as in the delayed_job consumers.
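Concretely, the loading step can look like the following; this is a minimal, framework-free sketch (the path and SHA are made up) of reading a REVISION file the way a Rails initializer might set GIT_REVISION at boot:

```ruby
require 'tmpdir'

# Read the deployed revision from a REVISION file, as an initializer
# might do with: GIT_REVISION = load_git_revision(Rails.root.join('REVISION'))
def load_git_revision(path)
  File.read(path).strip
end

Dir.mktmpdir do |root|
  revision_file = File.join(root, 'REVISION')
  # Stand-in for the deploy-time step: git rev-parse HEAD > REVISION
  File.write(revision_file, "0b7c6a1d9e8f\n")
  puts load_git_revision(revision_file)  # => 0b7c6a1d9e8f
end
```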
When we deploy code via Capistrano, we no longer stop, start, or restart delayed_job consumers. We install a cronjob on consumer nodes that runs every minute and determines if a delayed_job process is running. If one isn't, then a new one will be spawned.
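The cron check itself can be a one-liner. This is a sketch that assumes a standard Capistrano layout (/var/www/myapp/current), a deploy user, and that nothing else on the box matches the delayed_job process name:

```shell
# /etc/cron.d/delayed_job_watchdog -- hypothetical; every minute, spawn a
# consumer if no delayed_job process is currently running.
* * * * * deploy pgrep -f delayed_job > /dev/null || (cd /var/www/myapp/current && RAILS_ENV=production ./script/delayed_job start)
```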
As a result of all of this, all of the following conditions are met:
- Pushing code doesn't wait on delayed_job to restart/force kill anymore. Existing jobs that are running are left alone when new code is pushed.
- When a job begins, we can detect whether the consumer is running old code. If it is, the job gets requeued and the consumer kills itself.
- When a delayed_job dies, a new one is spawned via a cronjob with new code (by the nature of starting delayed_job, it has new code).
- If you're paranoid about killing delayed_job consumers, install a nagios check that does the same thing as the cron job but alerts you when a delayed_job process hasn't been running for 5 minutes.