Question

To summarize, I'm using iTextSharp to import specific pages from a PDF, possibly rotate, resize or otherwise alter each page, and export it into a new PDF. To this end I'm using iText's PdfWriter class. The problem I'm running into is that the writer doesn't appear to include the visible annotations from the source page (in my case, a comment) in the output page. Interestingly, it does include the embedded OCR text without issue.

Additionally, when I use iText's PdfCopy class, it works just as I want (comments are copied properly to the output). Unfortunately, PdfCopy doesn't offer much easily accessible functionality for the things I need to do with the page (e.g. resizing pages).

So I am looking for one of two solutions:

- Preferably, I'd like to continue using the writer class, but I need it to copy any visible annotations etc. from the source pages and include them in the output.

- Alternatively, I need some example code that uses the PdfCopy class to resize a page. There is a PdfCopy.SetPageSize function (which doesn't work for me, so I suspect I'm doing it wrong), but I have basically no idea how to properly scale the source page when it needs to be resized.

Here's some pseudo code to give you an idea of where I'm at with the PdfWriter class:

'...

Dim MS As New MemoryStream()
Dim document As New Document
Dim WriterPDF As PdfWriter = PdfWriter.GetInstance(document, MS)
Dim reader As PdfReader = Nothing
Dim cb As PdfContentByte = WriterPDF.DirectContent
reader = New PdfReader(New MemoryStream(File.ReadAllBytes(FilePathList.Item(ItemNum))))

For Each PageItem As Integer In PageNumList
    Dim page As PdfImportedPage = WriterPDF.GetImportedPage(reader, PageItem)
    Dim PageSizeStandard As Rectangle = PageSize.LETTER
    document.SetPageSize(PageSizeStandard)

    Dim tm = New System.Drawing.Drawing2D.Matrix()

    'code to resize, rotate etc... tm.scale, tm.rotate, etc.

    cb.AddTemplate(page, tm)
    document.NewPage()
Next

As the alternative, the PdfCopy code I'm using for rotation involves:

Dim MasterCopy As PdfCopy = New PdfCopy(document, New FileStream(outputPath, FileMode.Create))
'...
Dim PageDict As PdfDictionary = reader.GetPageN(PageItem)
' can get current rotation with this...
' Dim Rot As PdfNumber = PageDict.Get(PdfName.ROTATE)

'...
Dim RotatedPageSizeHeight As Single = reader.GetPageSizeWithRotation(PageItem).Height
Dim RotatedPageSizeWidth As Single = reader.GetPageSizeWithRotation(PageItem).Width

If RotatedPageSizeWidth > RotatedPageSizeHeight Then
    PageDict.Put(PdfName.ROTATE, New PdfNumber(90))
    'there is a Pdfname.Size, but no idea if that's even what I need or how to use it.
    'with the writer class I use a matrix to scale the page, works fine.
End If

'...
MasterCopy.AddPage(page)

Sorry the pseudo code is a bit fragmented, trying to keep it brief. Please let me know if I can provide any additional information. And thanks in advance!


Solution

I've created a small sample that implements what mkl suggests: ScaleRotate.

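PdfReader reader = new PdfReader(src);
int n = reader.getNumberOfPages();
PdfDictionary page;
for (int p = 1; p <= n; p++) {
    // Edit the page dictionary in place; annotations on the page are left untouched.
    page = reader.getPageN(p);
    // Scale up: only add a UserUnit if the page doesn't define one already.
    if (page.getAsNumber(PdfName.USERUNIT) == null)
        page.put(PdfName.USERUNIT, new PdfNumber(2.5f));
    // Rotate: dropping /Rotate falls back to the default rotation of 0.
    page.remove(PdfName.ROTATE);
}
// The stamper writes the modified page dictionaries to the destination file.
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
stamper.close();
reader.close();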

We start from an original PDF, pages.pdf, that has rotated pages, cropped pages and annotations. We scale and rotate some pages, resulting in pages_altered.pdf.

We introduce a UserUnit of 2.5 for all pages that don't define a UserUnit yet. If you change the UserUnit to 0.5, you'll see that it won't have any effect in Adobe Reader. The ISO standard for PDF says that the range of values that can be used for the user unit is implementation-dependent. Version 1.7 of the PDF specification originally written by Adobe says: "Acrobat 7.0 supports a maximum UserUnit value of 75,000." Nothing is said about the minimum value, but experience tells us that the minimum value supported by Adobe Reader is 1, meaning you can't scale down.
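To make the effect of the UserUnit concrete, here is a minimal sketch of the arithmetic in plain Java (no iText involved). It assumes the usual definition of UserUnit as the size of one user space unit in multiples of 1/72 inch, and it assumes a Letter-sized MediaBox of 612 x 792, which is not necessarily the size of the pages in pages.pdf. The names UserUnitDemo and toInches are made up for this illustration.

```java
import java.util.Locale;

/** Hypothetical helper, not part of iText: converts user space units to inches. */
final class UserUnitDemo {

    /**
     * Physical length in inches of a length expressed in user space units.
     * Assumes UserUnit is a multiple of 1/72 inch (the default is 1.0).
     */
    static double toInches(double userSpaceUnits, double userUnit) {
        return userSpaceUnits * userUnit / 72.0;
    }

    public static void main(String[] args) {
        double width = 612;    // assumed MediaBox width (US Letter)
        double height = 792;   // assumed MediaBox height (US Letter)
        double userUnit = 2.5; // the value used in the sample

        // Locale.ROOT keeps the decimal separator a dot regardless of system settings.
        System.out.println(String.format(Locale.ROOT, "%.2f x %.2f in",
                toInches(width, userUnit), toInches(height, userUnit)));
    }
}
```

With a UserUnit of 2.5, such a page measures 21.25 x 27.5 inches instead of 8.5 x 11. The numbers in the MediaBox stay the same, and so do the annotation coordinates, since they are expressed in the same user space units; only the size of one unit changes.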

As for the rotation, you can change the rotation of a page by changing the /Rotate key in the page dictionary. In the example, I removed the key, changing all pages shown in landscape (for which the value of /Rotate is 90) into portrait (the default value for /Rotate is 0). You'll notice that this doesn't have any effect on page 4. Page 4 isn't rotated. It looks like a page in landscape because its dimensions are defined in such a way that the width is greater than the height.
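The difference between a rotated page and a page that is simply wider than it is tall can be sketched in a few lines of plain Java. This is an illustration only: PageOrientation and its methods are hypothetical names, and the page sizes are assumed values rather than the actual dimensions in pages.pdf.

```java
/** Hypothetical helper, not part of iText: how /Rotate affects the displayed page size. */
final class PageOrientation {

    /** Returns {width, height} as displayed, given the page box size and the /Rotate value. */
    static double[] displayedSize(double boxWidth, double boxHeight, int rotate) {
        int r = ((rotate % 360) + 360) % 360; // normalize into the range 0..359
        boolean swapped = (r == 90 || r == 270);
        return swapped ? new double[] {boxHeight, boxWidth}
                       : new double[] {boxWidth, boxHeight};
    }

    /** True if the page is displayed wider than it is tall. */
    static boolean isLandscape(double boxWidth, double boxHeight, int rotate) {
        double[] size = displayedSize(boxWidth, boxHeight, rotate);
        return size[0] > size[1];
    }

    public static void main(String[] args) {
        // A portrait box shown in landscape through /Rotate 90.
        System.out.println(isLandscape(612, 792, 90));
        // Removing /Rotate (default 0) shows the same box in portrait.
        System.out.println(isLandscape(612, 792, 0));
        // A box that is wider than tall stays landscape without any rotation.
        System.out.println(isLandscape(792, 612, 0));
    }
}
```

The third call corresponds to the situation on page 4: there is no /Rotate value to remove, so the only way to change the orientation is to add a rotation or change the page dimensions.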

In summary: rotating pages in an existing PDF is a piece of cake, and so is scaling pages up to a bigger size. If you want to downscale pages, you can only use PdfWriter (which throws away all annotations), and you need to copy the annotations separately after transforming all the /Rect values of these annotations. This is a huge task. It took one of our customers several weeks to achieve this correctly. Be prepared to spend an equal amount of time if that's what you want.
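To give an idea of what transforming a /Rect value involves, here is a minimal sketch in plain Java using java.awt.geom.AffineTransform. It only covers the coordinate arithmetic for a single rectangle: the same matrix that is applied to the page content is applied to the four corners of the rectangle, and the bounding box of the result becomes the new /Rect. AnnotationRectTransform, fitTransform and transformRect are hypothetical names, not iText API, and the page sizes are assumptions. A real implementation also has to deal with other annotation entries (appearance streams, /QuadPoints, and so on), which is where most of the work is.

```java
import java.awt.geom.AffineTransform;
import java.util.Arrays;

/** Hypothetical helper, not part of iText: maps an annotation /Rect through a page transform. */
final class AnnotationRectTransform {

    /** Uniform scale that fits a source page into a target page, centered on the target. */
    static AffineTransform fitTransform(double srcW, double srcH, double dstW, double dstH) {
        double s = Math.min(dstW / srcW, dstH / srcH);
        double tx = (dstW - srcW * s) / 2.0;
        double ty = (dstH - srcH * s) / 2.0;
        // Resulting mapping: x' = s*x + tx, y' = s*y + ty
        return new AffineTransform(s, 0, 0, s, tx, ty);
    }

    /**
     * Transforms rect = {llx, lly, urx, ury} and returns the bounding box
     * of the transformed corners in the same layout.
     */
    static double[] transformRect(double[] rect, AffineTransform tm) {
        double[] corners = {
            rect[0], rect[1], rect[2], rect[1],
            rect[0], rect[3], rect[2], rect[3]
        };
        tm.transform(corners, 0, corners, 0, 4);

        double minX = corners[0], maxX = corners[0];
        double minY = corners[1], maxY = corners[1];
        for (int i = 2; i < corners.length; i += 2) {
            minX = Math.min(minX, corners[i]);
            maxX = Math.max(maxX, corners[i]);
            minY = Math.min(minY, corners[i + 1]);
            maxY = Math.max(maxY, corners[i + 1]);
        }
        return new double[] {minX, minY, maxX, maxY};
    }

    public static void main(String[] args) {
        // Assumed sizes: a 1224 x 1584 source page scaled down onto Letter (612 x 792).
        AffineTransform tm = fitTransform(1224, 1584, 612, 792);
        double[] rect = {100, 200, 300, 400}; // assumed /Rect of a comment
        System.out.println(Arrays.toString(transformRect(rect, tm)));
    }
}
```

In this example the scale factor is 0.5 and no offset is needed, so the rectangle {100, 200, 300, 400} maps to {50, 100, 150, 200}. Taking the bounding box of all four corners keeps the result a valid lower-left/upper-right rectangle when the matrix also contains a rotation.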

DISCLAIMER: the UserUnit value isn't supported by all viewers. Implementations may vary depending on the viewer that is used. The feature was introduced in PDF 1.6, meaning that the functionality won't work in any viewer supporting only older PDF versions.

Licensed under: CC-BY-SA with attribution