Question

I've got a PHP script on a shared web host that selects, from ~300 'feeds', the 40 that haven't been updated in the last half hour, fetches each one with cURL and then delivers the result to the user.

SELECT * FROM table WHERE latest_scan < NOW() - INTERVAL 30 MINUTE ORDER BY latest_scan ASC LIMIT 0, 40;
// Make cURL request and process it
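
In outline, that part of the script looks something like this (simplified sketch; the PDO connection, the url column and the timeout are placeholders, not my exact code):

<?php
// Simplified sketch: assumes a PDO connection in $pdo and `id`, `url`
// and `latest_scan` columns on the feeds table.
$stale = $pdo->query(
    "SELECT * FROM `table`
     WHERE latest_scan < NOW() - INTERVAL 30 MINUTE
     ORDER BY latest_scan ASC
     LIMIT 0, 40"
);

foreach ($stale as $feed) {
    $ch = curl_init($feed['url']);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10); // don't hang on a slow feed
    $body = curl_exec($ch);
    curl_close($ch);

    // ... process $body and deliver/store the result ...
}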

I want to be able to deliver updates as fast as possible, but don't want to bog down my server or the servers I'm fetching from (it's only a handful).

How often should I run the cron job, and should I limit the number of fetches per run? To how many?


Solution

It would be a good idea to "rate" how often each feed actually changes, so that if a feed changes on average once every 24 hours, you only fetch it every 12 hours.

Just store the number of changes and the number of tries for each feed and pick the ones that are due for a check... you can run the script every minute and let the statistics do the rest!
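
A rough sketch of what that could look like, assuming hypothetical scan_count (#tries) and change_count (#changes) columns maintained by the scanner itself:

<?php
// Sketch only: scan_count is incremented on every fetch,
// change_count only when the fetched content actually differs.
$feeds = $pdo->query(
    "SELECT *, TIMESTAMPDIFF(MINUTE, latest_scan, NOW()) AS age FROM `table`"
)->fetchAll();

$due = [];
foreach ($feeds as $feed) {
    // If 1 out of every N scans found a change and scans happen roughly
    // every 30 minutes, the feed changes about every N * 30 minutes.
    $scans   = max(1, (int) $feed['scan_count']);
    $changes = max(1, (int) $feed['change_count']);
    $avgChangeMinutes = 30 * $scans / $changes;

    // Re-fetch at half the estimated change interval, but never more often
    // than every 30 minutes and never less often than once a day.
    $fetchEvery = min(1440, max(30, $avgChangeMinutes / 2));

    if ((int) $feed['age'] >= $fetchEvery) {
        $due[] = $feed;
    }
}

// Still cap the batch at the stalest 40 feeds, as in the original query.
usort($due, function ($a, $b) {
    return (int) $b['age'] - (int) $a['age'];
});
$batch = array_slice($due, 0, 40);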

OTHER TIPS

On a shared host you might also run into script run-time limits. For instance, if your script runs longer than 30 seconds, the server may terminate it. If that's the case for your host, do some testing/logging of how long each feed takes to process and take that into account when deciding how many feeds to process per run.
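
One way to handle that, assuming roughly a 30-second limit (check max_execution_time on your host), is to log how long each feed takes and stop the batch before the limit is hit:

<?php
// Sketch only: stop early instead of being killed mid-batch.
// The 25-second budget is an assumption; leave headroom below your host's limit.
$budget  = 25;                   // seconds
$started = microtime(true);

foreach ($batch as $feed) {
    $t0 = microtime(true);

    // ... fetch and process the feed ...

    error_log(sprintf('feed %d took %.2fs', $feed['id'], microtime(true) - $t0));

    if (microtime(true) - $started > $budget) {
        break;                   // the next cron run picks up the rest
    }
}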

Another thing that helped me was marking the "last scan" as updated before processing each individual request, so that a problem feed doesn't keep failing and getting picked up on every cron run. If desired, you can update the entry again on failure and record the reason (if known) why it failed.
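
A sketch of that pattern, with a hypothetical last_error column for recording the failure reason:

<?php
// Sketch only: claim the feed up front so a crash or timeout while fetching
// it does not leave it at the front of the queue on every cron run.
// `last_error` is a hypothetical column, not part of the original schema.
$claim  = $pdo->prepare("UPDATE `table` SET latest_scan = NOW(), last_error = NULL WHERE id = ?");
$failed = $pdo->prepare("UPDATE `table` SET last_error = ? WHERE id = ?");

foreach ($batch as $feed) {
    $claim->execute([$feed['id']]);

    $ch = curl_init($feed['url']);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $body = curl_exec($ch);

    if ($body === false) {
        $failed->execute([curl_error($ch), $feed['id']]);
    } else {
        // ... process $body ...
    }
    curl_close($ch);
}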

Licensed under: CC-BY-SA with attribution