Question

I am using supervisor (http://supervisord.org/) to daemonize a fairly standard PHP script. The script is structured something like:

while (1) {
//  Do a SQL select
//  for any matching rows, do something
//  if I have been running for longer than 60 mins, exit
}

Today this script, which has been fairly stable for some time, hung. It did not crash (i.e. die from a signal such as SIGHUP or SIGTERM), which would have prompted supervisord to restart the process. It did not encounter any errors in its processing, which would either have been caught by the script or at least have triggered a fatal error and an exit. Instead of these "catchable" scenarios, it just sat there. We do have a cron job set up to run every hour and restart the script through the supervisorctl hook, because it seems to be generally accepted that PHP scripts leak memory and do well to be restarted if they run for long. The script resumed normal operation after that restart.

My question: how can I detect that this script has hung? I can't even begin to diagnose or troubleshoot why it hung if I am not somehow alerted to that state. I am looking for either a software solution, or an approach I can take to author a solution myself (in PHP, Python, Perl or shell).

The script is written in PHP 5.2.6 and runs on an up-to-date RHEL 5 server.

Please let me know if I can share any additional information if it will help with a more awesome solution.

Thank you!

Shaheeb R.


Solution

Since the script is hanging, PHP may not be able to run any additional code that could detect the hang from inside. For this reason, I suggest modifying the script to keep a log. This lets the main script tell anything outside of it that it is still running, and with some well-placed updates the log can also help pinpoint where things have gone awry.

The log can be written to a file or a database, and should contain at least an indicator of the script's status, such as a last-modified timestamp. If the script is not running constantly, something should also indicate whether it is running or has stopped. In the example you gave, the log write would occur inside the while loop at least once, possibly more. Opening file handles or a DB connection costs time and resources, so I recommend logging only what is needed. (Note: with the text-file approach, the file needs to be closed right after each write.)

Example:

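while (1) {
    write_log('Running SQL select');
    //  Do a SQL select
    write_log('Results retrieved');
    //  for any matching rows, do something
    //  (check log) if I have been running for longer than 60 mins, exit
}

// Named write_log() because log() is PHP's built-in natural logarithm
// and cannot be redeclared.
function write_log($msg) {
    // Append a timestamped line; the log path is only a placeholder.
    // file_put_contents() opens, writes and closes the file in one call.
    file_put_contents('/tmp/worker.log', date('c') . ' ' . $msg . "\n", FILE_APPEND);
}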

A separate script would need to check the log and report any errors, which could be problematic if it's affected by what's making the main script hang, but I can't think of an alternative.
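As a rough sketch of what that separate check might look like, the helper below treats the log file as a heartbeat and reports it as stale when it has not been modified recently. The function name heartbeat_is_stale, the log path, the 300-second threshold and the e-mail address are all hypothetical; it sticks to plain functions and old-style syntax so that it should also run on older PHP versions.

```php
<?php
// Hypothetical watchdog helper: returns true when the heartbeat/log file
// is missing or has not been modified within $maxAge seconds.
function heartbeat_is_stale($path, $maxAge, $now = null)
{
    if ($now === null) {
        $now = time();
    }
    clearstatcache();          // drop PHP's cached stat() results
    if (!file_exists($path)) {
        return true;           // no log at all: treat as "not alive"
    }
    $mtime = filemtime($path);
    if ($mtime === false) {
        return true;           // could not read the timestamp
    }
    return ($now - $mtime) > $maxAge;
}

// Possible use from a cron-driven script (all values are placeholders):
// if (heartbeat_is_stale('/tmp/worker.log', 300)) {
//     mail('ops@example.com', 'Worker hung?', 'No log activity for 5 minutes.');
// }
```

The threshold should be comfortably longer than the slowest normal loop iteration, otherwise a long but healthy query would be reported as a hang. Instead of (or in addition to) sending mail, the check could trigger the same supervisorctl restart that the hourly cron job already uses.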

With regard to memory, if you are not already using mysql_free_result, you should give it a try.

Other suggestions

My suggestion is similar to what @Shroder described, but takes it a little further. On each run you would create a log/DB entry that is timestamped and transaction-aware: mark the entry as "processing" at the start of the run, and sign it off as "completed" when the run finishes.

Alongside that, run a simple cron check that uses the timestamp and transaction state to see whether an entry has been in "processing" for longer than your trigger (60 minutes, for example). If it has, raise an alert.
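A minimal sketch of that state check, with the storage left out: in practice the run record would live in a database row or a file, but here it is just an array so the logic is easy to see. The function names and the array keys (status, started_at, finished_at) are hypothetical.

```php
<?php
// Hypothetical run record: array('status' => ..., 'started_at' => unix time).

// Called at the start of a run: mark it as "processing".
function start_run($now)
{
    return array('status' => 'processing', 'started_at' => $now);
}

// Called when the run finishes: sign the record off as "completed".
function complete_run($run, $now)
{
    $run['status'] = 'completed';
    $run['finished_at'] = $now;
    return $run;
}

// Called from the cron check: true if the run is still "processing"
// and started more than $limit seconds before $now.
function run_is_overdue($run, $limit, $now)
{
    return $run['status'] === 'processing'
        && ($now - $run['started_at']) > $limit;
}
```

The cron check would load the latest record, call run_is_overdue($run, 3600, time()), and alert when it returns true. A completed record is never reported, no matter how old it is.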

It's quite simple! Just calculate the difference in time from the start of the loop to the current execution point.

$starttime = microtime(true);
while (1) 
{
    //Do your stuff here
    //More SQL, whatever you need


    //Put this at the end of the loop
    $curtime = microtime(true);
    $timetaken = $curtime - $starttime;
    if($timetaken > (60 * 60))
    {
        break;
    }
}

microtime(true) returns the number of seconds since the Unix epoch as a float, so subtracting the start time from the current time gives the time elapsed, and the loop exits once that exceeds 60*60 seconds.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow