Problem

I've got a report that can generate over 30,000 records if given a large enough date range. From the HTML side of things, a resultset this large is not a problem since I implement a pagination system that limits the viewable results to 100 at a given time.

My real problem occurs once the user presses the "Get PDF" button. When this happens, I essentially re-run the portion of the report that prints the data (the results of the report itself are stored in a 'save' table so there's no need to re-run the data-gathering logic), and store the results in a variable called $html. Keep in mind that this variable now contains 30,000 records of data plus the HTML needed to format it correctly on the PDF. Once I've got this HTML string built, I pass it to TCPDF to try and generate the PDF file for the user. However, instead of generating the PDF file, it just craps out without an error message: the 'Generating PDF...' dialog disappears and the system acts like you never asked it to do anything.

Through testing, I've discovered that the problem lies in the size of the $html variable being passed in. If the report is under 3,000 records, it works fine. If it's over that, the HTML side of the report will print but the PDF won't.
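Stripped down, the generation code looks something like this (simplified for this post; $savedRows and the column names are just placeholders for the real saved report data):

    // Re-run the "print" portion of the report from the save table
    // and build one giant HTML string.
    $html = '<table>';
    foreach ($savedRows as $row) {   // ~30,000 rows
        $html .= '<tr><td>' . $row['date'] . '</td><td>' . $row['amount'] . '</td></tr>';
    }
    $html .= '</table>';

    // Hand the whole thing to TCPDF in one shot.
    $pdf = new TCPDF();
    $pdf->AddPage();
    $pdf->writeHTML($html);
    $pdf->Output('report.pdf', 'D');   // this is the step that silently fails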

Helpful Info

  • PHP 5.3
  • TCPDF for PDF generation (also tried PS2PDF)
  • Script Memory Limit: 500 MB

How would you guys handle this scale of data when generating a PDF of this size?


Solution

TCPDF appears to be a PDF generator implemented natively in PHP. You may get better performance from a compiled library like PDFlib or a command-line tool like htmldoc; the latter has the best chance of producing a PDF this large.
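For example, if htmldoc is installed on the server, you could shell out to it instead of rendering inside PHP. A minimal sketch (check the flags against your htmldoc version; the temp-file handling here is just for illustration):

    // Dump the HTML to a temp file and let htmldoc do the conversion.
    $htmlFile = tempnam(sys_get_temp_dir(), 'rpt') . '.html';
    $pdfFile  = tempnam(sys_get_temp_dir(), 'rpt') . '.pdf';
    file_put_contents($htmlFile, $html);

    $cmd = sprintf('htmldoc --webpage --quiet -f %s %s',
                   escapeshellarg($pdfFile),
                   escapeshellarg($htmlFile));
    exec($cmd, $output, $status);

    if ($status === 0) {
        header('Content-Type: application/pdf');
        readfile($pdfFile);
    }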

Also, are you breaking the output PDF into multiple pages? I.e. does TCPDF know to take a single HTML document and cut it into multiple pages, or are you generating multiple HTML files for it to combine into a single PDF document? That may also help.

Other tips

Here is how I solved this issue: I noticed that some of the strings in my HTML output had slight encoding issues. I ran htmlentities() on those particular strings as I was querying them from the database, and that cleared the problem.

Don't know if this was what was causing your problem, but my experience was very similar: when I was trying to output a large HTML table, with about 80,000 rows, TCPDF would display the page header but nothing table-related. The behaviour was the same with different sets of data and different table structures.

After many attempts I started adding my own pagination: every 15 table rows, I would break the page and add a new table on the following page. That's when I noticed that every once in a while I would get a blank page between a lot of full and correct ones. That made me realise there must be a problem with those particular subsets of data, and I discovered the encoding issue. It may be that you have something similar and TCPDF is not making it clear what the problem is.
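In code, the fix amounted to something like this (the query, the column names and the $db connection are made up for the example):

    // Encode each string field as it comes out of the database so stray
    // characters don't silently break TCPDF's HTML parsing further down.
    $rows   = array();
    $result = $db->query('SELECT item_name, item_desc FROM report_save');
    while ($row = $result->fetch_assoc()) {
        $row['item_name'] = htmlentities($row['item_name'], ENT_QUOTES, 'UTF-8');
        $row['item_desc'] = htmlentities($row['item_desc'], ENT_QUOTES, 'UTF-8');
        $rows[] = $row;
    }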

Are you using the writeHTML method?

I went through the performance recommendations here: http://www.tcpdf.org/performances.php

It says "Split large HTML blocks in smaller pieces;".

I found that if my blocks of HTML went over 20,000 characters the PDF would take well over 2 minutes to generate.

I simply split my HTML into blocks and called writeHTML() for each block, and generation time improved dramatically. A file that wouldn't generate in 2 minutes before now takes 16 seconds.
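The splitting itself was nothing fancy; roughly like this (a simplified sketch where $savedRows stands in for your saved report rows, split on row boundaries so every chunk is still valid HTML):

    $pdf = new TCPDF();
    $pdf->AddPage();

    // Flush the table to TCPDF in small blocks instead of one huge string.
    $chunkSize = 500;   // tune so each block stays well under ~20,000 characters
    foreach (array_chunk($savedRows, $chunkSize) as $chunk) {
        $html = '<table>';
        foreach ($chunk as $row) {
            $html .= '<tr><td>' . $row['date'] . '</td><td>' . $row['amount'] . '</td></tr>';
        }
        $html .= '</table>';

        $pdf->writeHTML($html);   // TCPDF adds new pages automatically as it fills up
    }

    $pdf->Output('report.pdf', 'D');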

I would break the PDF into parts, just like pagination.

1) Have "Get PDF" button on every paginated HTML page and allow downloading of records from that HTML page only.

2) Limit the maximum number of records that can be downloaded at once. If the limit is reached, split the output and let the user download multiple PDFs (see the sketch below).
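A rough sketch of option 2, assuming you already have a helper that turns a batch of rows into one PDF (generate_report_pdf() here is hypothetical, and $savedRows again stands in for the saved report data):

    $maxPerPdf = 3000;   // stay under the size where TCPDF starts giving up

    // Split the saved records into batches and produce one PDF per batch.
    $files = array();
    foreach (array_chunk($savedRows, $maxPerPdf) as $i => $batch) {
        $filename = sprintf('report_part_%d.pdf', $i + 1);
        generate_report_pdf($batch, $filename);   // hypothetical helper
        $files[] = $filename;
    }

    // Then offer $files as separate downloads, or zip them up for the user.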
