I have an old application written in C++ which I am porting to Ruby.
One section of the code uses execl() in order to replace the process with a[n updated] copy of itself while maintaining open file descriptors (this application is a network service).
if ( execl( "./my-app", "-restart", fd.c_str(), NULL ) < 0 ) {
Didn't take long to figure out that Ruby has no execl() equivalent, but that you can fake it in part using Process::spawn and the :close_others option. Or, at least I should be able to according to the documentation:
file descriptor inheritance: close non-redirected non-standard fds (3, 4, 5, ...) or not
:close_others => true : don't inherit
So, it seems to me that the following should spawn a new process which has access to all open file descriptors of the parent:
server_fd = @server.to_i
env = {
  "APP_REBOOT"    => "true",
  "APP_SERVER_FD" => server_fd.to_s,
}
command = "ruby my-app.rb"
options = {
  :in           => :in,
  :out          => :out,
  :err          => :err,
  :close_others => false,
}
pid = Process.spawn env, command, options
Process.detach pid
This should allow the child access to the descriptors. However, I cannot figure out how to exit the parent process without closing all the descriptors. In other words, if I cause the parent to exit at the end of the code:
server_fd = @server.to_i
env = {
  "APP_REBOOT"    => "true",
  "APP_SERVER_FD" => server_fd.to_s,
}
command = "ruby my-app.rb"
options = {
  :in           => :in,
  :out          => :out,
  :err          => :err,
  :close_others => false,
}
pid = Process.spawn env, command, options
Process.detach pid
exit # ADDED THIS LINE
Then the descriptors are also closed for the child.
I have a feeling this is more a problem with my approach to process management than something specific to Ruby, but I don't see what I'm doing wrong.
$ ruby -v
ruby 2.1.0p0 (2013-12-25 revision 44422) [x86_64-linux]
EDIT1
Just before my call to Process.spawn (or Process.exec as @mata points out) I have a diagnostic output:
system 'lsof -c ruby'
There is another call to that just inside my recover_from_reboot method. This is the tail of the output pre-reboot; you can see the listening server port and a connected client on the last two lines:
ruby 8957 chris 0u CHR 136,1 0t0 4 /dev/pts/1
ruby 8957 chris 1u CHR 136,1 0t0 4 /dev/pts/1
ruby 8957 chris 2u CHR 136,1 0t0 4 /dev/pts/1
ruby 8957 chris 3r FIFO 0,8 0t0 12213372 pipe
ruby 8957 chris 4w FIFO 0,8 0t0 12213372 pipe
ruby 8957 chris 5r FIFO 0,8 0t0 12213373 pipe
ruby 8957 chris 6w FIFO 0,8 0t0 12213373 pipe
ruby 8957 chris 7u IPv4 12213374 0t0 TCP localhost.localdomain:boks-servc (LISTEN)
ruby 8957 chris 8u IPv4 12213423 0t0 TCP localhost.localdomain:boks-servc->localhost.localdomain:45249 (ESTABLISHED)
And this is what I see post-reboot:
ruby 8957 chris 3r FIFO 0,8 0t0 12203947 pipe
ruby 8957 chris 4w FIFO 0,8 0t0 12203947 pipe
ruby 8957 chris 5r FIFO 0,8 0t0 12203948 pipe
ruby 8957 chris 6w FIFO 0,8 0t0 12203948 pipe
Again, this is whether I try spawn or exec.
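A further check that might narrow this down: the sketch below inspects the close-on-exec flag on a listening socket. It assumes Ruby 2.0 or later, where descriptors opened by Ruby are marked close-on-exec by default; the local `server` is a hypothetical stand-in for @server, and clearing the flag is shown only as one possible experiment, not a confirmed fix.

```ruby
require 'socket'

# Stand-in for @server (assumption); port 0 lets the OS pick a free port.
server = TCPServer.new('127.0.0.1', 0)

# true means the kernel closes this descriptor when the process execs,
# whatever :close_others is set to.
flag_before = server.close_on_exec?

# Clearing the flag is one way to let the descriptor survive an exec.
server.close_on_exec = false
flag_after = server.close_on_exec?

puts "fd #{server.fileno}: close-on-exec #{flag_before} -> #{flag_after}"
```

If the flag reads true before the call to spawn or exec, that would be consistent with the sockets missing from the post-reboot lsof output.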
EDIT2
Given my diagnostic output, I see that the server keeps binding to fd 7, and the client to 8. By adding
7 => 7,
8 => 8,
to my options hash, I am able to successfully persist these sockets across reboot using exec. It is feasible for me to manually add the server and [client1, client2,...] fds to the options hash, but this seems very dirty when :close_others is supposed to do the heavy lifting for me.
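For reference, the manual approach could be sketched roughly as follows. The helper name keep_open is hypothetical, and a pipe stands in for the server and client sockets so the example runs on its own; the child is started via RbConfig.ruby rather than "ruby my-app.rb" purely for illustration.

```ruby
require 'rbconfig'

# Hypothetical helper: map each IO's descriptor number to itself, giving
# entries like 7 => 7 that can be merged into the spawn/exec options hash.
def keep_open(ios)
  ios.each_with_object({}) { |io, opts| opts[io.fileno] = io.fileno }
end

# A pipe stands in for the server and client sockets (assumption).
reader, writer = IO.pipe

env = { "APP_SERVER_FD" => writer.fileno.to_s }

# The child reopens the inherited descriptor by number and writes to it.
child_code = 'io = IO.for_fd(ENV["APP_SERVER_FD"].to_i, "w"); ' \
             'io.puts "inherited"; io.flush'

options = { :in => :in, :out => :out, :err => :err }.merge(keep_open([writer]))
pid = Process.spawn(env, RbConfig.ruby, "-e", child_code, options)

writer.close        # parent drops its copy so the read below reaches EOF
Process.wait(pid)
message = reader.read
puts message
```

In the real application the list passed to keep_open would presumably be [@server, client1, client2, ...], and the same merged hash could be handed to exec instead of spawn.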