Question

We're using the Zend_PDF module in SugarCRM to merge pdf invoices that our system generates. I have been able to successfully merge a number of PDFs (around 10 to 30 in my tests), but we're getting memory errors when we try to merge larger numbers of pdf files. The error looks something like this:

[30-Jan-2012 14:10:20] PHP Fatal error: Allowed memory size of 268435456 bytes exhausted at /usr/local/src/php-5.3.8/Zend/zend_operators.c:1265 (tried to allocate 68134 bytes) in /srv/www/htdocs/sugar6_mf/Zend/Pdf/Element/Object/Stream.php on line 442

The above error was generated when we tried to merge 457 pdf files - that's files, not pages. We're going to need to merge 5,000 and more at a time eventually.

Can anyone offer any help/advice on how to address this?

If needed, ask, and I'll post the code on how the merged pdf is being generated.

Thanks.

Was it helpful?

Solution

I should preface this answer by saying that I know nothing about SugarCRM - my response is based solely on my knowledge of Zend_Pdf.

If my understanding is correct, you have a PHP script (hopefully not running inside Apache considering the length of time it will take to process 5,000 files) that is taking multiple PDF files as input using the Zend_Pdf::load() method and then iterating through the pages of each PDF object and adding them to one target instance of Zend_Pdf, which you are then writing to a file using the save() method.

Using this approach, even if you unset() each of the source PDF objects after you've added the pages to the target PDF object, you'll still need enough memory to store the entire output file. If you blew through 250MB with only 457 files, then I'm guessing your input PDF files are probably about 500KB, so your output file is going to be absolutely huge, so you are still going to end up running out of memory.

My advice would be to ditch this method entirely and use pdftk instead, which you could invoke using the exec() function. I'm sure there's a limit to the size of the arguments you can provide to exec(), so it will probably be a multi-step process with several intermediate files, but ultimately I think this will be a faster, more robust solution.

And just to re-iterate an earlier point, I would not run this process within Apache. I would set up a cron job that runs at the appropriate intervals and drops the output file into a secure area on your web/file server.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top