Question

I'm trying to figure out the most efficient way to run a pretty hefty PHP task thousands of times a day. It needs to make an IMAP connection to Gmail, loop over the emails, save the info to the database, and save images locally.

Running this task every so often using a cron isn't that big of a deal, but I need to run it every minute, and I know eventually the crons will start running on top of each other and cause memory issues.

What is the next step up when you need to efficiently run a task multiple times a minute? I've been reading about beanstalk & pheanstalk and I'm not entirely sure if that will do what I need. Thoughts???


Solution

One option is to create a locking mechanism so the scripts won't overlap. This is quite simple: since the script only runs every minute, a simple .lock file suffices:

<?php
  // Bail out if a previous run is still active.
  if (file_exists("foo.lock")) exit(0);
  file_put_contents("foo.lock", getmypid());

  do_stuff_here();

  // Release the lock so the next run can proceed.
  unlink("foo.lock");
?>

This will make sure the scripts don't run in parallel; you just have to make sure the .lock file is deleted when the program exits, so the script should have a single point of exit (apart from the early exit at the top).
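A more robust variant of the same idea (a sketch, not part of the original answer) uses flock(): the check and the acquisition are a single atomic step, and the operating system releases the lock automatically if the script crashes, so a stale lock file can never block future runs. The lock-file path and do_stuff_here() are placeholders.

```php
<?php
// flock() makes "check and acquire" atomic, unlike a separate
// file_exists() test. The lock is dropped automatically when the
// process exits or crashes, so no stale lock is left behind.
$fp = fopen("/tmp/fetchmail.lock", "c");   // "c": create if missing, don't truncate
if ($fp === false || !flock($fp, LOCK_EX | LOCK_NB)) {
    exit(0);   // a previous run is still active
}

do_stuff_here();

flock($fp, LOCK_UN);
fclose($fp);
```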

A good alternative, as Brian Roach suggested, is a dedicated server process that runs all the time and keeps the connection to the IMAP server open. This reduces overhead a lot and is not much harder to write than a normal PHP script:

<?php
  connect();
  while (is_world_not_invaded_by_aliens())
  {
    get_mails();
    get_images();
    sleep(time_to_next_check());
  }
  disconnect();
?>
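Concretely, such a long-running worker might look like the following sketch (this assumes PHP's imap and pcntl extensions are available; $user and $pass are placeholders, and the processing step is elided):

```php
<?php
// Long-running worker: log in once, then poll forever.
// A SIGTERM handler flips $running so the loop can finish cleanly.
$running = true;
pcntl_signal(SIGTERM, function () use (&$running) { $running = false; });

$imap = imap_open("{imap.gmail.com:993/imap/ssl}INBOX", $user, $pass);

while ($running) {
    $uids = imap_search($imap, 'UNSEEN');
    // ... loop over $uids, save info to the database, save images locally ...
    sleep(60);
    pcntl_signal_dispatch();   // deliver any pending SIGTERM
}

imap_close($imap);
```

Keeping the IMAP session open between polls is the main win here: the per-minute cost drops to a single search instead of a full SSL handshake and login.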

OTHER TIPS

I'm not a PHP guy, but ... what prevents you from running your script as a daemon? I've written many a Perl script that does just that.

I've got a number of scripts like these, where I don't want to run them from cron in case they stack up.

#!/bin/sh
# Fetch once, wait a minute, then restart this wrapper from scratch.
php -f fetchFromImap.php
sleep 60
exec $0

The exec $0 part starts the script running again, replacing itself in memory, so it will run forever without issues. Any memory the PHP script uses is cleaned up whenever it exits, so that's not a problem either.

A simple line will start it, and put it into the background:

cd /x/y/z ; nohup ./loopToFetchMail.sh &

or it can be similarly started when the machine boots by various means (such as cron's '@reboot ....' entry).
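For instance, the same background start could be wired into a crontab entry like this (the /x/y/z path is illustrative, as above):

@reboot cd /x/y/z && nohup ./loopToFetchMail.sh >/dev/null 2>&1 &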

fcron (http://fcron.free.fr/) will not start a new job if the old one is still running, so you could use an '@ 1 command' entry and not worry about race conditions.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow