Question

I'm [mis]managing thin workers with god, and getting an unexpected result that I'm having trouble diagnosing. My god watch block is fairly straightforward:

(1..4).each do |index|
  God.watch do |w|
    name     = "thin-#{index}"
    port     = 8300 + index
    pid_name = "thin.#{port}.pid"
    pid_path = God.pid_file_directory.join pid_name

    # General settings.
    w.name     = name
    w.interval = 5.seconds
    w.pid_file = pid_name

    # Configuration for thin instances.
    start_options = {
      address:     '0.0.0.0',
      environment: RAILS_ENV,
      only:        index,
      port:        port,
    }.inject("") { |s,(k,v)| s << " --#{k}=#{v}" }

    # Process management commands.
    bin       = "/vagrant/bin/thin -c #{RAILS_ROOT}"
    w.start   = "#{bin} #{start_options} start"
    w.restart = "#{bin} --pid #{pid_path} restart"
    w.stop    = "#{bin} --pid #{pid_path} stop"

    # Management conditions.
    w.start_if do |start|
      start.condition(:process_running) do |c|
        # Ensure the worker is running.
        c.interval = 5.seconds
        c.running  = false
      end
    end

  end
end

And it will fire up 4 thin servers, but on the wrong ports. Here's output from running god in debug mode:

vagrant@localhost:/vagrant$ RAILS_ENV=development god -c config/app.god -D
I [2014-04-30 18:27:05]  INFO: Loading config/app.god
I [2014-04-30 18:27:05]  INFO: Syslog enabled.
I [2014-04-30 18:27:05]  INFO: Using pid file directory: /vagrant/tmp/pids
I [2014-04-30 18:27:06]  INFO: Started on drbunix:///tmp/god.17165.sock
I [2014-04-30 18:27:06]  INFO: thin-1 move 'unmonitored' to 'up'
I [2014-04-30 18:27:06]  INFO: thin-1 moved 'unmonitored' to 'up'
I [2014-04-30 18:27:06]  INFO: thin-3 move 'unmonitored' to 'up'
I [2014-04-30 18:27:06]  INFO: thin-4 move 'unmonitored' to 'up'
I [2014-04-30 18:27:06]  INFO: thin-4 moved 'unmonitored' to 'up'
I [2014-04-30 18:27:06]  INFO: thin-2 move 'unmonitored' to 'up'
I [2014-04-30 18:27:06]  INFO: thin-3 moved 'unmonitored' to 'up'
I [2014-04-30 18:27:06]  INFO: thin-2 moved 'unmonitored' to 'up'
I [2014-04-30 18:27:06]  INFO: thin-1 [trigger] process is not running (ProcessRunning)
I [2014-04-30 18:27:06]  INFO: thin-3 [trigger] process is not running (ProcessRunning)
I [2014-04-30 18:27:06]  INFO: thin-4 [trigger] process is not running (ProcessRunning)
I [2014-04-30 18:27:06]  INFO: thin-1 move 'up' to 'start'
I [2014-04-30 18:27:06]  INFO: thin-3 move 'up' to 'start'
I [2014-04-30 18:27:06]  INFO: thin-4 move 'up' to 'start'
I [2014-04-30 18:27:06]  INFO: thin-4 start: /vagrant/bin/thin -c /vagrant  --address=0.0.0.0 --environment=development --only=4 --port=8304 start
I [2014-04-30 18:27:06]  INFO: thin-2 [trigger] process is not running (ProcessRunning)
I [2014-04-30 18:27:06]  INFO: thin-2 move 'up' to 'start'
I [2014-04-30 18:27:06]  INFO: thin-1 start: /vagrant/bin/thin -c /vagrant  --address=0.0.0.0 --environment=development --only=1 --port=8301 start
I [2014-04-30 18:27:06]  INFO: thin-3 start: /vagrant/bin/thin -c /vagrant  --address=0.0.0.0 --environment=development --only=3 --port=8303 start
I [2014-04-30 18:27:06]  INFO: thin-2 start: /vagrant/bin/thin -c /vagrant  --address=0.0.0.0 --environment=development --only=2 --port=8302 start

But instead of my thin servers listening on 830{1,2.3.4}, they're listening on 830{2,4.6.8} instead:

vagrant@localhost:/vagrant$ ps aux | grep thin
vagrant   4535 49.8  2.3 113712 83936 ?        Sl   18:27   0:03 thin server (0.0.0.0:8306)                                                                                                                                                                                                                                   
vagrant   4541 51.4  2.3 113464 83716 ?        Sl   18:27   0:03 thin server (0.0.0.0:8304)                                                                                                                                                                                                                                   
vagrant   4547 55.5  2.3 113276 83504 ?        Sl   18:27   0:03 thin server (0.0.0.0:8302)                                                                                                                                                                                                                                   
vagrant   4553 54.2  2.3 113740 83928 ?        Sl   18:27   0:03 thin server (0.0.0.0:8308)                                                                                                                                                                                                                                   
vagrant   4569  0.0  0.0   3912   828 pts/0    S+   18:27   0:00 grep --color=auto thin

All of my pid files, log entries, etc show the {2.4.6,8} numbers. Given my code and debug output from god, I'm thinking there's something wrong on thin's end, but I have no idea what to look for.

Also, this pattern (base + 2,4,6,8 instead of base + 1,2,3,4) is repeatable - e.g. if I set the port base to 5555, the results are not 5556, 5557, 5558, and 5559 but instead 5557, 5559, 5561, and 5563.

Am I missing something from the thin documentation? Is there some conflict and I need to tell god to stagger the start commands by a few seconds? Are there additional diagnostic tools I could be using? I'm stuck in the mud here.


EDIT1

If try to "go with the flow" and change

(1..4).each do |index|

to

[2,4,6,8].each do |index|

then instead of {2,4,6,8}, I get {4,8,12,16} which leaves me just as confused as before.


EDIT2

I'm now pretty certain this has nothing to do with god, so I'm changing the question title. Here's new evidence:

vagrant@localhost:/vagrant$ /vagrant/bin/thin -c /vagrant  --address=0.0.0.0 --environment=development --only=1 --port=8301 --servers=1 start
Starting server on 0.0.0.0:8302 ... 

Why does thin refuse to start on the port which I request?


EDIT3

Breaking this down to the simplest case, I find the culprit:

vagrant@localhost:/vagrant$ /vagrant/bin/thin -c /vagrant --port=8301 start
Using rack adapter
Thin web server (v1.6.2 codename Doc Brown)
Maximum connections set to 1024
Listening on 0.0.0.0:8301, CTRL+C to stop
^CStopping ...
vagrant@localhost:/vagrant$ /vagrant/bin/thin -c /vagrant --only=1 --port=8301 start
Starting server on 0.0.0.0:8302 ... 
vagrant@localhost:/vagrant$ ^C

So for some reason the --only directive causes misbehavior vis a vis port binding. The problem now is that I lose control over where thin is placing its pid file when I don't specify --only=1.

Was it helpful?

Solution

Found the answer somewhere in between all the things I'd tried. In final form, I've got the following:

# Configuration for thin servers.
start_options = {
  address:     '0.0.0.0',
  chdir:       RAILS_ROOT,
  environment: RAILS_ENV,
  port:        port,
}.inject("") { |s,(k,v)| s << " --#{k}=#{v}" }

# Process management commands.
bin       = "/vagrant/bin/thin"
# NOTE: pid not specified during start
w.start   = "#{bin} #{start_options} start"
# NOTE: pid specified for restart and stop
w.restart = "#{bin} --pid tmp/pids/thin-#{index}.pid restart"
w.stop    = "#{bin} --pid tmp/pids/thin-#{index}.pid stop"

Setting God.pid_file_directory = 'tmp/pids' while omitting w.pid_file allows god to place pid files "naturally" (e.g. based on process name like tmp/pids/thin-1.pid). Also it's important in w.start to omit --pid, --daemonize, and --only to prevent thin from daemonizing and clustering itself. In this setup, god is handling all of those details so I need to keep thin from trying outsmart itself.

My previous configurations bungled this clean separation, and the result was that god was trying to manage thin, but thin was also trying to manage itself. Much like an overbearing parent in a screaming match with a rebellious teenager, the more control each attempted to exert over the situation, the more twisted and unstable the overall system became.

The last piece of the puzzle - which enables full process control via god - is that --pid (while unspecified during start) must be passed to thin during restart and stop - thin doesn't know which process to signal at that point without a PID since god is (rightly) managing the daemonization.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top