Question

I am trying to merge 1000 PDF files through iText. I am not sure where the memory leak is happening. Below is the sample code. Note that I am removing each child PDF file as soon as it has been merged into the parent file. Please point out the bug in the code below, or suggest a better way to do this without excessive memory consumption. This process runs inside a servlet (not a standalone program).

FileInputStream local_fis = null;
BufferedInputStream local_bis = null;
File localFileObj = null;
for(int taIdx=0;taIdx<totalSize;taIdx++){
    frObj = (Form3AReportObject)reportRows.get(taIdx);
    localfilename = companyId + "_" +  frObj.empNumber + ".pdf";

    local_fis = new FileInputStream(localfilename);
    local_bis = new BufferedInputStream(local_fis); 
    pdfReader = new PdfReader(local_bis);

    cb = pdfWriter.getDirectContent(); 
    document.newPage();
    page = pdfWriter.getImportedPage(pdfReader, 1);
    cb.addTemplate(page, 0, 0);
    local_bis.close();
    local_fis.close();

    localFileObj = new File(localfilename);
    localFileObj.delete();
}
document.close();

Solution

You might want to try something like the following (exception handling, file close and delete removed for clarity):

for(int taIdx = 0; taIdx < totalSize; taIdx++) {
    Form3AReportObject frObj = (Form3AReportObject)reportRows.get(taIdx);

    localfilename = companyId + "_" +  frObj.empNumber + ".pdf";

    FileInputStream local_fis = new FileInputStream(localfilename);

    pdfWriter.freeReader(new PdfReader(local_fis));

    pdfWriter.flush();
}

pdfWriter.close();
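
The snippet above leaves out the page-import step for brevity. A fuller sketch of how freeReader and flush might fit into the original loop (just a sketch, reusing the document, pdfWriter, reportRows, companyId and totalSize from the question) could look like this:

PdfContentByte cb = pdfWriter.getDirectContent();

for (int taIdx = 0; taIdx < totalSize; taIdx++) {
    Form3AReportObject frObj = (Form3AReportObject) reportRows.get(taIdx);
    String localfilename = companyId + "_" + frObj.empNumber + ".pdf";

    PdfReader pdfReader = new PdfReader(localfilename);

    // Import the single page and place it on a new page of the merged document.
    document.newPage();
    cb.addTemplate(pdfWriter.getImportedPage(pdfReader, 1), 0, 0);

    // Hand the imported data over to the writer and push it to the output,
    // so it is no longer held in memory.
    pdfWriter.freeReader(pdfReader);
    pdfWriter.flush();

    pdfReader.close();
    new File(localfilename).delete();
}

document.close();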

OTHER TIPS

Who says there is a memory leak? Your merged document needs to fit into memory in its entirety; there is no way around that, and its in-memory size (rather than its size on disk) may well exceed the default heap size of 64 MB.
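
If you want to check what heap limit the servlet container is actually running with, a quick way is to log the JVM's maximum heap size:

// Maximum heap the JVM is allowed to grow to, in megabytes.
long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
System.out.println("Max heap: " + maxHeapMb + " MB");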

I don't see a problem with your code, but if you want to diagnose it in detail, use VisualVM's heap profiler (it has shipped with the JDK since Java 6 update 10 or so).

What if you don't use an InputStream? If you can, try passing just the path of the file, as in new PdfReader("/somedirectory/file.pdf"). This makes the reader work from the file on disk.
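
As a rough sketch (reusing the localfilename variable and writer setup from the question), the loop body would become:

// Open the PDF by path so the reader can work from the file on disk
// instead of a fully buffered InputStream.
PdfReader pdfReader = new PdfReader(localfilename);

document.newPage();
cb.addTemplate(pdfWriter.getImportedPage(pdfReader, 1), 0, 0);

pdfReader.close();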

The code in the question creates the PdfContentByte object (cb) inside the loop. Moving it outside the loop might fix the issue. I used similar code in my application to stitch together 13k individual PDFs into one PDF without trouble:

// iText 2.x imports (adjust the package names for other iText versions).
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.ArrayList;
import java.util.List;

import com.lowagie.text.Document;
import com.lowagie.text.pdf.PdfCopy;
import com.lowagie.text.pdf.PdfReader;

public class PdfUtils {

    public static void concatFiles(File file1, File file2, File fileOutput) throws Exception {
        List<File> islist = new ArrayList<File>();
        islist.add(file1);
        islist.add(file2);

        concatFiles(islist, fileOutput);
    }

    public static void concatFiles(List<File> filelist, File fileOutput) throws Exception {
        if (filelist.size() > 0) {
            // Use the first file only to pick up the page size for the output document.
            PdfReader reader = new PdfReader(new FileInputStream(filelist.get(0)));
            Document document = new Document(reader.getPageSizeWithRotation(1));
            reader.close();

            PdfCopy cp = new PdfCopy(document, new FileOutputStream(fileOutput));
            document.open();

            for (File file : filelist) {
                PdfReader r = new PdfReader(new FileInputStream(file));
                for (int k = 1; k <= r.getNumberOfPages(); ++k) {
                    cp.addPage(cp.getImportedPage(r, k));
                }
                // Free the reader's data from the writer as soon as its pages are copied.
                cp.freeReader(r);
                r.close();
            }

            cp.close();
            document.close();
        } else {
            throw new Exception("The list of PDFs to concatenate is empty");
        }
    }
}
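
For example, the helper above could be called like this (the file names are just placeholders):

List<File> inputs = new ArrayList<File>();
for (int i = 0; i < 1000; i++) {
    inputs.add(new File("report_" + i + ".pdf"));   // hypothetical input files
}

// Merge all of them into a single output PDF.
PdfUtils.concatFiles(inputs, new File("merged.pdf"));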

Instead of merging 1000 PDFs, consider creating a ZIP archive of them.
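
A rough sketch of that approach with the standard java.util.zip classes (names are placeholders) might look like this:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipPdfs {

    // Writes each file into the zip archive under its own name.
    public static void zipFiles(List<File> files, File zipFile) throws IOException {
        ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(zipFile));
        byte[] buffer = new byte[8192];
        try {
            for (File f : files) {
                zos.putNextEntry(new ZipEntry(f.getName()));
                FileInputStream in = new FileInputStream(f);
                try {
                    int len;
                    while ((len = in.read(buffer)) > 0) {
                        zos.write(buffer, 0, len);
                    }
                } finally {
                    in.close();
                }
                zos.closeEntry();
            }
        } finally {
            zos.close();
        }
    }
}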

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow