Using Upstart to manage Unicorn w/ rbenv + bundler binstubs w/ ruby-local-exec shebang

https://stackoverflow.com/questions/8667425

08-04-2021

Question

Alright, this is melting my brain. It might have something to do with the fact that I don't understand Upstart as well as I should. Sorry in advance for the long question.

I'm trying to use Upstart to manage a Rails app's Unicorn master process. Here is my current /etc/init/app.conf:

description "app"

start on runlevel [2]
stop on runlevel [016]

console owner

# expect daemon

script
  APP_ROOT=/home/deploy/app
  PATH=/home/deploy/.rbenv/shims:/home/deploy/.rbenv/bin:$PATH
  $APP_ROOT/bin/unicorn -c $APP_ROOT/config/unicorn.rb -E production # >> /tmp/upstart.log 2>&1
end script

# respawn

That works just fine - the Unicorns start up great. What's not great is that the PID detected is not of the Unicorn master, it's of an sh process. That in and of itself isn't so bad, either - if I wasn't using the automagical Unicorn zero-downtime deployment strategy. Because shortly after I send -USR2 to my Unicorn master, a new master spawns up, and the old one dies...and so does the sh process. So Upstart thinks my job has died, and I can no longer restart it with restart or stop it with stop if I want.

I've played around with the config file, trying to add -D to the Unicorn line (like this: $APP_ROOT/bin/unicorn -c $APP_ROOT/config/unicorn.rb -E production -D) to daemonize Unicorn, and I added the expect daemon line, but that didn't work either. I've tried expect fork as well. Various combinations of all of those things can cause start and stop to hang, and then Upstart gets really confused about the state of the job. Then I have to restart the machine to fix it.

I think Upstart is having problems detecting when/if Unicorn is forking because I'm using rbenv + the ruby-local-exec shebang in my $APP_ROOT/bin/unicorn script. Here it is:

#!/usr/bin/env ruby-local-exec
#
# This file was generated by Bundler.
#
# The application 'unicorn' is installed as part of a gem, and
# this file is here to facilitate running it.
#

require 'pathname'
ENV['BUNDLE_GEMFILE'] ||= File.expand_path("../../Gemfile",
  Pathname.new(__FILE__).realpath)

require 'rubygems'
require 'bundler/setup'

load Gem.bin_path('unicorn', 'unicorn')

Additionally, the ruby-local-exec script looks like this:

#!/usr/bin/env bash
#
# `ruby-local-exec` is a drop-in replacement for the standard Ruby
# shebang line:
#
#    #!/usr/bin/env ruby-local-exec
#
# Use it for scripts inside a project with an `.rbenv-version`
# file. When you run the scripts, they'll use the project-specified
# Ruby version, regardless of what directory they're run from. Useful
# for e.g. running project tasks in cron scripts without needing to
# `cd` into the project first.

set -e
export RBENV_DIR="${1%/*}"
exec ruby "$@"

So there's an exec in there that I'm worried about. It fires up a Ruby process, which fires up Unicorn, which may or may not daemonize itself, which all happens from an sh process in the first place...which makes me seriously doubt the ability of Upstart to track all of this nonsense.

Is what I'm trying to do even possible? From what I understand, the expect stanza in Upstart can only be told (via daemon or fork) to expect a maximum of two forks.

Solution

Your Upstart job needs to be configured so that Upstart knows exactly how many times the process forks. And Upstart can only follow one fork (expect fork) or two (expect daemon), no more.

In unix land there are two key system calls that facilitate running programs: fork and exec.

fork copies the process that calls it. One process calls fork, and it returns control back to two processes. Each process must identify which it is (the parent or the child) from the value returned by fork (see the man page for details).
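Here is a minimal sketch of those two return paths in Ruby, whose Kernel#fork wraps the system call. The method name fork_and_report and the pipe are just illustration devices so the child can tell the parent which pid it ended up with.

```ruby
# Returns [pid the parent got back from fork, pid the child saw for itself].
def fork_and_report
  reader, writer = IO.pipe
  child_pid = fork
  if child_pid.nil?
    # Child: fork returned nil here. Report our own pid, then leave
    # without running any exit hooks inherited from the parent.
    reader.close
    writer.puts Process.pid
    writer.close
    exit!(0)
  end
  # Parent: fork returned the child's pid.
  writer.close
  reported = reader.gets.to_i
  reader.close
  Process.wait(child_pid)
  [child_pid, reported]
end

from_fork, from_child = fork_and_report
puts "parent #{Process.pid}: fork returned #{from_fork}, child saw #{from_child}"
```

The two numbers printed after "fork returned" and "child saw" should match, and both should differ from the parent's pid.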

exec runs a new program, replacing the process that called exec.

When you simply run a command in a shell, under the hood the shell calls fork to create a new process with its own id, and that new process (after some setup) immediately calls exec to start the command you typed. This is how most programs are run, whether by a shell or your window manager or whatever. See the system function in C, which also has variants in most of the scripting languages.
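Roughly, that fork-then-exec dance looks like the Ruby sketch below. The name run_like_a_shell is made up for this example, and a real shell does a lot more setup (redirections, job control, signals) between the fork and the exec.

```ruby
# Roughly what a shell does for each external command: fork, exec the
# command in the child, and wait for it in the parent.
def run_like_a_shell(*argv)
  child_pid = fork do
    # Child: replace this copy of the process with the requested program.
    exec(*argv)
  end
  # Parent: keeps its own pid and just waits for the child to finish.
  Process.wait(child_pid)
  $?.exitstatus
end

status = run_like_a_shell("sh", "-c", "exit 0")
puts "child exited with status #{status}"
```

The parent never changes identity here. Only the forked child is replaced by the new program, which is why the command ends up with a pid of its own.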

If you think it's inefficient, you're probably right. It's how it has been done in unix since days of yore, and apparently nobody is game to change it. One of the reasons is that there are many things that are not replaced on exec, including (sometimes) open files, and the process's user and group ids.

Another reason is that a LOT of effort has been spent making fork efficient, and they have actually done a pretty good job of it - in modern unixes (with the help of the CPU) fork actually copies very little of the process. I guess nobody wants to throw all that work away.

And (pause for effect) one more thing that is not replaced on exec: the process's pid.

To demonstrate:

mslade@mickpc:~$ echo $$
3652
mslade@mickpc:~$ bash
mslade@mickpc:~$ echo $$
6545
mslade@mickpc:~$ exec bash
mslade@mickpc:~$ echo $$
6545
mslade@mickpc:~$ exit
exit
mslade@mickpc:~$ echo $$
3652

Most of the popular languages have variations of fork and exec, including shell, C, perl, ruby and python. But not java.

So with all that in mind, what you need to do to make your upstart job work is make sure that it forks the same number of times as upstart thinks it does.

The exec line in ruby-local-exec is actually a good thing: it prevents a fork. Also, load doesn't start a new process; it just loads the code into the existing ruby interpreter and runs it.
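If you want to convince yourself about load, a small sketch like this one should do it. It writes a throwaway script to a temp file (the file and the global $pid_inside_load are invented for the demo), loads it, and compares pids.

```ruby
require "tempfile"

# Loads a throwaway script and returns the pid that script observed.
# A global is used because a loaded file cannot see the caller's locals.
def pid_seen_by_loaded_script
  Tempfile.create(["loaded", ".rb"]) do |file|
    file.write("$pid_inside_load = Process.pid\n")
    file.flush
    load file.path
  end
  $pid_inside_load
end

puts "caller pid: #{Process.pid}, pid inside loaded file: #{pid_seen_by_loaded_script}"
```

Both pids should be identical, because the loaded code runs inside the interpreter that called load.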

However, your shell script forks in this line:

$APP_ROOT/bin/unicorn -c $APP_ROOT/config/unicorn.rb -E production # >> /tmp/upstart.log 2>&1

To prevent this, you can just change it to:

exec $APP_ROOT/bin/unicorn -c $APP_ROOT/config/unicorn.rb -E production # >> /tmp/upstart.log 2>&1

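If you do that, AFAICT the Unicorn master should end up running in the very process that Upstart started (as long as you don't pass -D, it stays in the foreground), and you won't need to tell Upstart to expect a fork.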
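To see the effect of that one-word change without touching Upstart, here is a rough Ruby sketch that runs a tiny sh wrapper both ways. The wrapper script and the method name wrapper_and_program_pids are made up for the demo, and an inner sh stands in for bin/unicorn.

```ruby
# Runs a small sh wrapper that prints its own pid and then starts a second
# program that prints its pid. Returns both pids as integers.
def wrapper_and_program_pids(use_exec)
  # The trailing `true` keeps shells that turn the last command of a
  # script into an implicit exec from hiding the fork.
  launch = use_exec ? "exec sh -c 'echo $$'" : "sh -c 'echo $$'; true"
  script = "echo $$; #{launch}"
  IO.popen(["sh", "-c", script], &:read).split.map(&:to_i)
end

[false, true].each do |use_exec|
  wrapper, program = wrapper_and_program_pids(use_exec)
  puts "exec=#{use_exec}: wrapper pid #{wrapper}, program pid #{program}"
end
```

Without exec the program should show a different pid from the wrapper, which is the sh process Upstart was tracking in your job. With exec the two pids should be the same, so the pid Upstart recorded is the program itself.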

Licensed under: CC-BY-SA with attribution
