Question

So here's the lowdown:

The client I'm developing for is on HostGator, which limits max_execution_time to 30 seconds, and it cannot be overridden (I've tried, and confirmed via their support and wiki that it can't be).

What I have the code doing is take an uploaded file and...

  1. loop through the XML
  2. get all feed download links within the file
  3. download each XML file
  4. loop through each file's XML array and insert the information of each item into the database based on where it came from (i.e. the filename)

Now, is there any way I can queue this somehow, or possibly split the workload into multiple files? I know the code works flawlessly and checks whether each item exists before inserting it, but I'm stuck getting around the execution limit.

Any suggestions are appreciated, let me know if you have any questions!


Solution

The time limit is only in effect when executing PHP scripts through a web server; if you execute the script from the CLI or as a background process, it should work fine.

Note that executing an external script is somewhat dangerous if you are not careful enough, but it's a valid option.
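For example, here is a minimal sketch of kicking off a background worker from the upload handler. worker.php and the file path are hypothetical stand-ins for your own script; note that the CLI default for max_execution_time is 0 (unlimited):

    <?php
    // A minimal sketch, assuming a hypothetical worker.php that does the
    // actual feed processing. Run as a separate background process, the
    // worker is not bound by the web server's max_execution_time.
    $uploadedFile = escapeshellarg('/path/to/uploaded.xml');

    // "> /dev/null 2>&1 &" detaches the process, so this request returns
    // immediately instead of waiting for the worker to finish.
    exec("php worker.php {$uploadedFile} > /dev/null 2>&1 &");

    echo "Import started in the background.";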

Check the following resources:

Process Control Extensions

And specifically:

pcntl-exec

pcntl-fork
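As a rough illustration of the fork approach (keep in mind pcntl is a CLI-only extension and is often disabled on shared hosts like HostGator, so check availability first; processFeeds() is a hypothetical stand-in for the actual import code):

    <?php
    // Minimal pcntl_fork() sketch: the parent returns right away while
    // the child keeps running with the long import.
    $pid = pcntl_fork();

    if ($pid === -1) {
        die('Could not fork');
    } elseif ($pid === 0) {
        // Child process: do the long-running work here.
        processFeeds(); // hypothetical: your download/insert loop
        exit(0);
    }

    // Parent process: free to respond to the user immediately.
    echo "Forked worker with PID {$pid}.";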

OTHER TIPS

Did you know you can trick the max_execution_time by registering a shutdown handler? Within that code you can run for another 30 seconds ;-)
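A minimal sketch of that trick; runImport() and resumeImport() are hypothetical stand-ins for your processing code:

    <?php
    // When the script is killed for exceeding max_execution_time, any
    // registered shutdown handlers still run -- and the timer restarts,
    // so the handler gets its own time budget.
    register_shutdown_function(function () {
        resumeImport(); // hypothetical: pick up where the main run stopped
    });

    runImport(); // hypothetical: the normal processing, which may time out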

Okay, now for something more useful.

You can add a small queue table in your database to keep track of where you are in case the script dies mid-way.

  • After getting all the download links, you add those to the table
  • Then you download one file and process it; when you're done, you check it off (delete it from the queue)
  • Upon each run you check if there's still work left in the queue

For this to work you need to request that URL a few times; perhaps use JavaScript to keep reloading until the work is done?
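A minimal sketch of one such run, assuming a hypothetical feed_queue table with id and url columns (names and credentials are illustrative, not from the original post):

    <?php
    $pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

    // Grab one pending feed; if the queue is empty, we're done.
    $row = $pdo->query('SELECT id, url FROM feed_queue LIMIT 1')->fetch();
    if ($row === false) {
        exit('Queue empty - all feeds processed.');
    }

    // Download and process a single feed per request, staying well
    // under the 30-second limit.
    $xml = simplexml_load_file($row['url']);
    // ... check for existing items and insert new ones here ...

    // Check this feed off the queue so the next request takes the next one.
    $stmt = $pdo->prepare('DELETE FROM feed_queue WHERE id = ?');
    $stmt->execute([$row['id']]);

    // Have the browser call this script again until the queue drains.
    echo '<script>location.reload();</script>';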

I am in such a situation. My approach is similar to Jack's:

  • accept that the execution time limit will simply be there
  • design the application to cope with a sudden exit (look into register_shutdown_function; see the sketch after this list)
  • identify all time-demanding parts of the process
  • continuously save the progress of the process
  • modify your components so that they are able to start from an arbitrary point, e.g. a position in an XML file, or continue downloading your to-be-fetched list of XML links
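A minimal sketch of the save-progress idea, assuming hypothetical loadProgress()/saveProgress()/getPendingItems()/processItem() helpers that persist a position marker (e.g. the index of the last XML item handled):

    <?php
    $start = loadProgress();    // hypothetical: where the previous run stopped
    $progress = $start;

    // Runs even when the script dies on a timeout fatal, so the marker
    // survives a sudden exit.
    register_shutdown_function(function () use (&$progress) {
        saveProgress($progress); // hypothetical persistence helper
    });

    foreach (getPendingItems($start) as $item) {
        processItem($item);      // hypothetical: the actual import work
        $progress++;             // advance the marker as we go
    }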

For the task I made two modules: Import for the actual processing, and TaskManagement for dealing with these tasks.
For invoking the TaskManager I use cron; whether that's enough depends on what your web host offers. There's also WebCron.
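As an illustration, a hypothetical crontab entry (the path is a placeholder):

    # Run the task manager every 5 minutes; each short run picks up
    # whatever work is left in the queue.
    */5 * * * * php /home/user/taskmanager.php > /dev/null 2>&1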

The advantage of Jack's JavaScript method is that it only adds requests when needed. If there are no tasks to be executed, the script's runtime will be very short (perhaps "very short" is overstated*, but still). The downsides are that it requires the user to wait the whole time, to not close the tab/browser, JS support, etc. *) Likely much less demanding than one click from one user at such a moment.

Then, of course, look into performance improvements: caching, skipping what's not needed or hasn't changed, etc.

Licensed under: CC-BY-SA with attribution