I'm [mis]managing thin workers with god, and getting an unexpected result that I'm having trouble diagnosing. My god watch
block is fairly straightforward:
(1..4).each do |index|
God.watch do |w|
name = "thin-#{index}"
port = 8300 + index
pid_name = "thin.#{port}.pid"
pid_path = God.pid_file_directory.join pid_name
# General settings.
w.name = name
w.interval = 5.seconds
w.pid_file = pid_name
# Configuration for thin instances.
start_options = {
address: '0.0.0.0',
environment: RAILS_ENV,
only: index,
port: port,
}.inject("") { |s,(k,v)| s << " --#{k}=#{v}" }
# Process management commands.
bin = "/vagrant/bin/thin -c #{RAILS_ROOT}"
w.start = "#{bin} #{start_options} start"
w.restart = "#{bin} --pid #{pid_path} restart"
w.stop = "#{bin} --pid #{pid_path} stop"
# Management conditions.
w.start_if do |start|
start.condition(:process_running) do |c|
# Ensure the worker is running.
c.interval = 5.seconds
c.running = false
end
end
end
end
And it will fire up 4 thin servers, but on the wrong ports. Here's output from running god in debug mode:
vagrant@localhost:/vagrant$ RAILS_ENV=development god -c config/app.god -D
I [2014-04-30 18:27:05] INFO: Loading config/app.god
I [2014-04-30 18:27:05] INFO: Syslog enabled.
I [2014-04-30 18:27:05] INFO: Using pid file directory: /vagrant/tmp/pids
I [2014-04-30 18:27:06] INFO: Started on drbunix:///tmp/god.17165.sock
I [2014-04-30 18:27:06] INFO: thin-1 move 'unmonitored' to 'up'
I [2014-04-30 18:27:06] INFO: thin-1 moved 'unmonitored' to 'up'
I [2014-04-30 18:27:06] INFO: thin-3 move 'unmonitored' to 'up'
I [2014-04-30 18:27:06] INFO: thin-4 move 'unmonitored' to 'up'
I [2014-04-30 18:27:06] INFO: thin-4 moved 'unmonitored' to 'up'
I [2014-04-30 18:27:06] INFO: thin-2 move 'unmonitored' to 'up'
I [2014-04-30 18:27:06] INFO: thin-3 moved 'unmonitored' to 'up'
I [2014-04-30 18:27:06] INFO: thin-2 moved 'unmonitored' to 'up'
I [2014-04-30 18:27:06] INFO: thin-1 [trigger] process is not running (ProcessRunning)
I [2014-04-30 18:27:06] INFO: thin-3 [trigger] process is not running (ProcessRunning)
I [2014-04-30 18:27:06] INFO: thin-4 [trigger] process is not running (ProcessRunning)
I [2014-04-30 18:27:06] INFO: thin-1 move 'up' to 'start'
I [2014-04-30 18:27:06] INFO: thin-3 move 'up' to 'start'
I [2014-04-30 18:27:06] INFO: thin-4 move 'up' to 'start'
I [2014-04-30 18:27:06] INFO: thin-4 start: /vagrant/bin/thin -c /vagrant --address=0.0.0.0 --environment=development --only=4 --port=8304 start
I [2014-04-30 18:27:06] INFO: thin-2 [trigger] process is not running (ProcessRunning)
I [2014-04-30 18:27:06] INFO: thin-2 move 'up' to 'start'
I [2014-04-30 18:27:06] INFO: thin-1 start: /vagrant/bin/thin -c /vagrant --address=0.0.0.0 --environment=development --only=1 --port=8301 start
I [2014-04-30 18:27:06] INFO: thin-3 start: /vagrant/bin/thin -c /vagrant --address=0.0.0.0 --environment=development --only=3 --port=8303 start
I [2014-04-30 18:27:06] INFO: thin-2 start: /vagrant/bin/thin -c /vagrant --address=0.0.0.0 --environment=development --only=2 --port=8302 start
But instead of my thin servers listening on 830{1,2.3.4}, they're listening on 830{2,4.6.8} instead:
vagrant@localhost:/vagrant$ ps aux | grep thin
vagrant 4535 49.8 2.3 113712 83936 ? Sl 18:27 0:03 thin server (0.0.0.0:8306)
vagrant 4541 51.4 2.3 113464 83716 ? Sl 18:27 0:03 thin server (0.0.0.0:8304)
vagrant 4547 55.5 2.3 113276 83504 ? Sl 18:27 0:03 thin server (0.0.0.0:8302)
vagrant 4553 54.2 2.3 113740 83928 ? Sl 18:27 0:03 thin server (0.0.0.0:8308)
vagrant 4569 0.0 0.0 3912 828 pts/0 S+ 18:27 0:00 grep --color=auto thin
All of my pid files, log entries, etc show the {2.4.6,8} numbers. Given my code and debug output from god, I'm thinking there's something wrong on thin's end, but I have no idea what to look for.
Also, this pattern (base + 2,4,6,8 instead of base + 1,2,3,4) is repeatable - e.g. if I set the port base to 5555, the results are not 5556, 5557, 5558, and 5559 but instead 5557, 5559, 5561, and 5563.
Am I missing something from the thin documentation? Is there some conflict and I need to tell god to stagger the start
commands by a few seconds? Are there additional diagnostic tools I could be using? I'm stuck in the mud here.
EDIT1
If try to "go with the flow" and change
(1..4).each do |index|
to
[2,4,6,8].each do |index|
then instead of {2,4,6,8}, I get {4,8,12,16} which leaves me just as confused as before.
EDIT2
I'm now pretty certain this has nothing to do with god, so I'm changing the question title. Here's new evidence:
vagrant@localhost:/vagrant$ /vagrant/bin/thin -c /vagrant --address=0.0.0.0 --environment=development --only=1 --port=8301 --servers=1 start
Starting server on 0.0.0.0:8302 ...
Why does thin refuse to start on the port which I request?
EDIT3
Breaking this down to the simplest case, I find the culprit:
vagrant@localhost:/vagrant$ /vagrant/bin/thin -c /vagrant --port=8301 start
Using rack adapter
Thin web server (v1.6.2 codename Doc Brown)
Maximum connections set to 1024
Listening on 0.0.0.0:8301, CTRL+C to stop
^CStopping ...
vagrant@localhost:/vagrant$ /vagrant/bin/thin -c /vagrant --only=1 --port=8301 start
Starting server on 0.0.0.0:8302 ...
vagrant@localhost:/vagrant$ ^C
So for some reason the --only
directive causes misbehavior vis a vis port binding. The problem now is that I lose control over where thin is placing its pid file when I don't specify --only=1
.