What's the easiest way to merge (server-side) a collection of PDF documents into one big PDF document in JAVA

StackOverflow https://stackoverflow.com/questions/90350

  •  01-07-2019
  •  | 
  •  

Question

I have 3 PDF documents that are generated on the fly by a legacy library that we use, and written to disk. What's the easiest way for my JAVA server code to grab these 3 documents and turn them into one long PDF document where it's just all the pages from document #1, followed by all the pages from document #2, etc.

Ideally I would like this to happen in memory so I can return it as a stream to the client, but writing it to disk is also an option.

Was it helpful?

Solution

@J D OConal, thanks for the tip, the article you sent me was very outdated, but it did point me towards iText. I found this page that explains how to do exactly what I need: http://java-x.blogspot.com/2006/11/merge-pdf-files-with-itext.html

Thanks for the other answers, but I don't really want to have to spawn other processes if I can avoid it, and our project already has itext.jar, so I'm not adding any external dependancies

Here's the code I ended up writing:

public class PdfMergeHelper {

    /**
     * Merges the passed in PDFs, in the order that they are listed in the java.util.List.
     * Writes the resulting PDF out to the OutputStream provided.
     * 
     * Sample Usage:
     * List<InputStream> pdfs = new ArrayList<InputStream>();
     * pdfs.add(new FileInputStream("/location/of/pdf/OQS_FRSv1.5.pdf"));
     * pdfs.add(new FileInputStream("/location/of/pdf/PPFP-Contract_Genericv0.5.pdf"));
     * pdfs.add(new FileInputStream("/location/of/pdf/PPFP-Quotev0.6.pdf"));
     * FileOutputStream output = new FileOutputStream("/location/to/write/to/merge.pdf");
     * PdfMergeHelper.concatPDFs(pdfs, output, true);
     * 
     * @param streamOfPDFFiles the list of files to merge, in the order that they should be merged
     * @param outputStream the output stream to write the merged PDF to
     * @param paginate true if you want page numbers to appear at the bottom of each page, false otherwise
     */
    public static void concatPDFs(List<InputStream> streamOfPDFFiles, OutputStream outputStream, boolean paginate) {
        Document document = new Document();
        try {
            List<InputStream> pdfs = streamOfPDFFiles;
            List<PdfReader> readers = new ArrayList<PdfReader>();
            int totalPages = 0;
            Iterator<InputStream> iteratorPDFs = pdfs.iterator();

            // Create Readers for the pdfs.
            while (iteratorPDFs.hasNext()) {
                InputStream pdf = iteratorPDFs.next();
                PdfReader pdfReader = new PdfReader(pdf);
                readers.add(pdfReader);
                totalPages += pdfReader.getNumberOfPages();
            }
            // Create a writer for the outputstream
            PdfWriter writer = PdfWriter.getInstance(document, outputStream);

            document.open();
            BaseFont bf = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
            PdfContentByte cb = writer.getDirectContent(); // Holds the PDF
            // data

            PdfImportedPage page;
            int currentPageNumber = 0;
            int pageOfCurrentReaderPDF = 0;
            Iterator<PdfReader> iteratorPDFReader = readers.iterator();

            // Loop through the PDF files and add to the output.
            while (iteratorPDFReader.hasNext()) {
                PdfReader pdfReader = iteratorPDFReader.next();

                // Create a new page in the target for each source page.
                while (pageOfCurrentReaderPDF < pdfReader.getNumberOfPages()) {
                    document.newPage();
                    pageOfCurrentReaderPDF++;
                    currentPageNumber++;
                    page = writer.getImportedPage(pdfReader, pageOfCurrentReaderPDF);
                    cb.addTemplate(page, 0, 0);

                    // Code for pagination.
                    if (paginate) {
                        cb.beginText();
                        cb.setFontAndSize(bf, 9);
                        cb.showTextAligned(PdfContentByte.ALIGN_CENTER, "" + currentPageNumber + " of " + totalPages,
                                520, 5, 0);
                        cb.endText();
                    }
                }
                pageOfCurrentReaderPDF = 0;
            }
            outputStream.flush();
            document.close();
            outputStream.close();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (document.isOpen()) {
                document.close();
            }
            try {
                if (outputStream != null) {
                    outputStream.close();
                }
            } catch (IOException ioe) {
                ioe.printStackTrace();
            }
        }
    }
}

OTHER TIPS

I've used pdftk to great effect. It's an external application that you'll have to run from your java app.

iText seems to have changed and now has commercial licencing requirements, along with not that good help (Want documentation? Buy our book!).

We ended up finding PDFSharp http://www.pdfsharp.net/ and using that. The sample for concatenating multiple pdf documents together is simple and easy to follow: http://www.pdfsharp.net/wiki/ConcatenateDocuments-sample.ashx

Enjoy Random

Take a look at this list of Java open source PDF libraries.

Also check out this article.

[Edit: There's always Ghostscript, which is easy to use, but who wants more dependencies?]

PDFBox is by far the easiest way to achieve this, there is a utility called PDFMerger within the code which makes things very easy, all it took me was a for loop and 2 lines of code in it and all done :)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top