Question

I am writing a Perl script that will write some inputs and send those inputs to an external program. There is a small but non-zero chance that this program will hang, and I want to time it out:

my $pid = fork;
if ($pid > 0){
    eval{
        local $SIG{ALRM} = sub { die "TIMEOUT!"};
        alarm $num_secs_to_timeout;
        waitpid($pid, 0);
        alarm 0;
    };
}
elsif ($pid == 0){
    exec('echo blahblah | program_of_interest');
    exit(0);
}

As it stands now, after $num_secs_to_timeout, program_of_interest still persists. I tried to kill it in the anonymous subroutine for $SIG{ALRM} as follows:

local $SIG{ALRM} = sub{kill 9, $pid; die "TIMEOUT!"}

but this doesn't do anything. program_of_interest is still persisting. How do I go about killing this process?

Solution

I was able to kill my exec()ed process by killing its process group, as shown in the answer to the question "In perl, killing child and its children when child was created using open". I modified my code as follows:

my $pid = fork;
if ($pid > 0){
    eval{
        local $SIG{ALRM} = sub {kill 9, -$PID; die "TIMEOUT!"};
        alarm $num_secs_to_timeout;
        waitpid($pid, 0);
        alarm 0;
    };
}
elsif ($pid == 0){
    setpgrp(0,0);
    exec('echo blahblah | program_of_interest');
    exit(0);
}

After timeout, program_of_interest is successfully killed.
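
Some background on why the group kill makes a difference: a single-string exec() containing a pipe is run through the shell, so $pid in the parent is the shell's PID and program_of_interest is a separate process started by that shell. Killing only $pid can leave the rest of the pipeline running. Below is a minimal, self-contained sketch of the same idea, assuming a POSIX system; the name run_with_timeout and its return values are made up for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: run a shell command line and kill its whole
# process group if it runs longer than $timeout seconds.
# Returns 1 if the command finished in time, 0 if it had to be killed.
sub run_with_timeout {
    my ($cmd, $timeout) = @_;

    my $pid = fork;
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: become the leader of a new process group, so the shell
        # and everything it starts share one group ID (equal to $pid).
        setpgrp(0, 0);
        exec('/bin/sh', '-c', $cmd) or die "exec failed: $!";
    }

    my $finished = eval {
        local $SIG{ALRM} = sub { die "TIMEOUT\n" };
        alarm $timeout;
        waitpid($pid, 0);
        alarm 0;
        1;
    };
    return 1 if $finished;

    # A negative PID sends the signal to every process in that group.
    kill 'KILL', -$pid;
    waitpid($pid, 0);    # reap the shell so it does not stay a zombie
    return 0;
}
```

For example, run_with_timeout('sleep 30 | cat', 1) should return 0 after about a second, with both sleep and cat gone. Note the lowercase -$pid in the kill call.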

OTHER TIPS

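The above code (by strictlyrude27) didn't work out of the box, because the signal handler uses -$PID in capitals where the variable is actually named $pid, so it should read -$pid. (BTW: there's also the timeout command from GNU coreutils: http://www.gnu.org/software/coreutils/manual/html_node/timeout-invocation.html)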
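
If the coreutils timeout command is installed, calling it from Perl might look roughly like the sketch below. This assumes a timeout binary on the PATH; the wrapper name run_with_coreutils_timeout is made up for illustration. GNU timeout documents an exit status of 124 when the time limit is reached.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the external `timeout` command.
# Returns the exit status reported by timeout: the command's own
# status if it finished, or 124 if the time limit was reached.
sub run_with_coreutils_timeout {
    my ($seconds, @cmd) = @_;

    # List-form system() bypasses the shell, so arguments need no quoting.
    system('timeout', $seconds, @cmd);
    die "could not run timeout: $!" if $? == -1;

    return $? >> 8;
}
```

A call such as run_with_coreutils_timeout(1, 'sleep', '5') should then return 124. To run a pipeline, pass it through a shell explicitly, for example ('sh', '-c', 'echo blahblah | program_of_interest').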

Here's an example with test:

#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

my $prg = basename $0;
my $num_secs_sleep = 2;
my $num_secs_to_timeout = 1;
my $orig_program = "sleep $num_secs_sleep; echo \"Look ma, survived!\"";
my $program = $orig_program;
my $expect = "";

if (@ARGV){
  if($ARGV[0] eq "test"){
    test();
    exit 0;
  } elsif (@ARGV == 1) {
    $num_secs_to_timeout = $ARGV[0];
  } elsif (@ARGV == 2) {
    $program = $ARGV[0];
    $num_secs_to_timeout = $ARGV[1];
  } else {
    die "Usage: $prg [ \"test\" | [program] seconds ] "
  }
}

if($orig_program eq $program) {
  if(@ARGV < 2) {
    $expect = $num_secs_to_timeout > $num_secs_sleep ?
      "(we expected to survive.)" : "(we expected to TIME OUT!)";
  }
  print STDERR "sleeping: $num_secs_sleep seconds$/";
}

print STDERR <<END;
  timeout after: $num_secs_to_timeout seconds,
  running program: '$program'
END

if($orig_program eq $program) {
  print STDERR "$expect$/";
}

exit Timed::timed($program, $num_secs_to_timeout);

sub test {
  eval "use Test::More qw(no_plan);";
  my $stdout;
  close STDOUT;
  open STDOUT, '>', \$stdout or die "Can't open STDOUT: $!";
  Timed::timed("sleep 1", 3);
  is($stdout, undef);
  Timed::timed("sleep 2", 1);
  is($stdout, "TIME OUT!$/");
}

################################################################################
package Timed;
use strict;
use warnings;

sub timed {
  my $retval;
  my ($program, $num_secs_to_timeout) = @_;
  my $pid = fork;
  die "fork failed: $!" unless defined $pid; # undef == 0 would otherwise take the child branch
  if ($pid > 0){ # parent process
    eval{
      local $SIG{ALRM} = 
        sub {kill 9, -$pid; print STDOUT "TIME OUT!$/"; $retval = 124;};
      alarm $num_secs_to_timeout;
      waitpid($pid, 0);
      alarm 0;
    };
    return defined($retval) ? $retval : $?>>8;
  }
  elsif ($pid == 0){ # child process
    setpgrp(0,0);
    exec($program) or die "Can't exec '$program': $!";
  } else { # forking not successful
  }
}

Hmmm, your code works for me after some minor modifications, which I assume were only needed because you turned your code into a generic example.

So that leaves me with a few ideas:

  1. You removed the problem when you created the sample code - try creating a small sample that actually runs (I had to change 'program_of_interest' and $num_secs_to_timeout to real values to test it). Make sure the sample has the same problem.
  2. It's something to do with the program_of_interest you're running. As far as I know, you can't mask a kill 9, but maybe there's something going on. Have you tried testing your code with a really simple script? I created one for my testing that goes: while (1) { print "hi\n"; sleep 1; }
  3. Something else.

Good luck...

SIGKILL cannot be caught or ignored. The only way a process can survive it is if the process is stuck in an uninterruptible system call. Check the state of the hung process (with ps aux): if the state is D, the process can't be killed until that call returns.
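
To check the state from inside the script instead of by hand, something like the following sketch might do. It assumes either a Linux /proc filesystem or a ps that accepts -o stat= and -p; the helper name process_state is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the state code of a process (for example
# "S", "R" or "D"), or undef if the process cannot be found.
sub process_state {
    my ($pid) = @_;
    die "PID must be a positive integer\n"
        unless defined $pid && $pid =~ /\A[1-9][0-9]*\z/;

    # On Linux, /proc/<pid>/status has a line like "State:  D (disk sleep)".
    if (open my $fh, '<', "/proc/$pid/status") {
        while (my $line = <$fh>) {
            return $1 if $line =~ /^State:\s+(\S)/;
        }
    }

    # Elsewhere, ask ps for just the state column, without a header line.
    my $out = `ps -o stat= -p $pid 2>/dev/null`;
    $out = '' unless defined $out;
    return $out =~ /(\S+)/ ? $1 : undef;
}
```

A result starting with D after the kill would point to uninterruptible sleep rather than a problem in the Perl code.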

You might also want to check that the signal handler is actually being called by printing something from it.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow