Using statement variable being reused with PDFSharp

https://stackoverflow.com/questions/20311217

06-08-2022

Question

The following code uses PDFSharp to split out pages of pdf documents into pages that are smaller than A4 and pages that are larger than A3:

''' <summary>
''' Process the list of pdfs
''' </summary>
Public Sub ProcessPdfs()

    Dim tempPath As String

    ' Code omitted

    ' Generate a temporary path in case pdfs need to be saved
    If String.IsNullOrEmpty(Me.tempFolder) OrElse Not Directory.Exists(Me.tempFolder) Then
        tempFolder = Path.GetTempPath()
    End If
    tempPath = Path.Combine(Me.tempFolder, Path.GetRandomFileName + ".pdf")

    ' Loop through the pages of the pdfs and process each page in turn. Processing involves
    ' determining the size of the page, then shrinking, adding the footer and then adding to
    ' the appropriate output pdf
    For Each referenceNumber As String In Me.Pdfs.Keys
        For Each pdf As PdfDocument In Me.Pdfs(referenceNumber)
            ' Save the pdf to disk for PDFSharp to be able to read it properly
            If String.IsNullOrEmpty(pdf.FullPath) Then
                pdf.Save(tempPath)
                pdf = PdfReader.Open(tempPath)
            End If
            For Each page As PdfPage In pdf.Pages

                ' Code omitted

                Select Case pageArea
                    Case Is <= a4PageArea
                        Call AddPage(referenceNumber, pdf, page, PageSize.A4)
                    Case Else
                        Call AddPage(referenceNumber, pdf, page, PageSize.A3)
                End Select
            Next
        Next
    Next

    ' Code omitted

    ' Delete temporary pdfs if there are any
    If File.Exists(tempPath) Then
        File.Delete(tempPath)
    End If

End Sub

''' <summary>
''' Add the specified page to the specified output document
''' </summary>
''' <returns>The page which was added to the output pdf</returns>
Private Function AddPage(ByVal ReferenceNumber As String, ByVal ParentPdf As PdfDocument, ByVal ParentPdfPage As PdfPage, ByVal PageSize As PageSize) As PdfPage

    ' Code omitted

    ' Copy the specified page onto the newly created page
    Using parentForm As XPdfForm = XPdfForm.FromFile(ParentPdf.FullPath)
        parentForm.PageIndex = ParentPdf.Pages.Cast(Of PdfPage)().ToList().IndexOf(ParentPdfPage)
        scaleFactor = 1
        ' Create PdfSharp graphics object with which to write onto the page
        Using graphics As XGraphics = XGraphics.FromPdfPage(outputPdfPage)
            graphics.SmoothingMode = XSmoothingMode.HighQuality

            ' Code omitted

            ' Draw the page
            Call graphics.DrawImage(parentForm, targetRect)
        End Using
    End Using

    Return outputPdfPage

End Function

What this does is take a pdf, read each page and then scale it so that it fits the size of the page onto which it is to be printed.

PDFSharp has trouble opening documents which were created in Adobe v6, so I use iTextSharp to rebuild the pdf in a version that PDFSharp can open. These PDFs are rebuilt in memory, and for some reason they need to be written to disk for PDFSharp to process them correctly.

In ProcessPdfs() I check if the pdf has a physical path and if not I save it at a temp location.

The problem I found is that AddPage() seems to continuously work with the same pdf. I checked the temporary pdf files created on disk and they are correct, i.e. different each time.

But the file loaded in the first using statement by XPdfForm.FromFile(ParentPdf.FullPath) never changes. It's as if the code realises that the file path does not change and so decides not to reload the file.

I thought that using a using statement would ensure that the variable would be disposed of at the end and therefore the file would be reloaded anew every time. Am I misunderstanding? Or what is happening here?

Incidentally, I worked around this by saving each pdf file under a different file name, which is why I think that the variable from the using block is being reused every time based on the file name...
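
The workaround described above might look roughly like the following sketch. It generates a fresh temporary path for every document instead of one path before the loop, and remembers all of them so each can be deleted afterwards. The module name, the helper NewTempPdfPath and the placeholder file contents are hypothetical; in ProcessPdfs() the write would be pdf.Save(tempPath) followed by PdfReader.Open(tempPath).

```vbnet
Imports System.Collections.Generic
Imports System.IO

Module UniqueTempFileSketch

    ' Hypothetical helper: returns a new, random .pdf path in the given folder.
    Function NewTempPdfPath(folder As String) As String
        Return Path.Combine(folder, Path.GetRandomFileName() & ".pdf")
    End Function

    Sub Main()
        Dim tempFolder As String = Path.GetTempPath()
        Dim tempPaths As New List(Of String)

        ' One temporary file per document, never the same name twice
        For i As Integer = 1 To 3
            Dim tempPath As String = NewTempPdfPath(tempFolder)
            tempPaths.Add(tempPath)
            ' Placeholder for pdf.Save(tempPath) / PdfReader.Open(tempPath)
            File.WriteAllText(tempPath, "placeholder " & i)
        Next

        ' Delete every temporary file, not only the last one
        For Each tempPath As String In tempPaths
            If File.Exists(tempPath) Then
                File.Delete(tempPath)
            End If
        Next
    End Sub

End Module
```

Keeping the list of paths matters because the original cleanup only checks a single tempPath.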

Solution

XPdfForm caches the documents it loads internally, with the file name as the key. If you reuse a file name for a new document, the old, cached document is used instead of the new file on disk.

The cache is thread-local.

So it's not a bug, it's a feature.
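
To illustrate the effect, here is a toy model of a cache keyed by path. It is not PDFsharp's actual implementation: the LoadedDocument class, the simulated disk and the Load function are all hypothetical, and the thread-local aspect is left out. It only shows why overwriting a file under the same name goes unnoticed.

```vbnet
Imports System
Imports System.Collections.Generic

Module PathKeyedCacheSketch

    ' Hypothetical stand-in for a parsed document; not a PDFsharp type.
    Private Class LoadedDocument
        Public Property Content As String
    End Class

    ' Toy cache: file path -> document loaded the first time that path was seen
    Private ReadOnly cache As New Dictionary(Of String, LoadedDocument)

    ' Simulated disk: file path -> current file contents
    Private ReadOnly disk As New Dictionary(Of String, String)

    ' Returns the cached document for the path, reading the "disk" only on a miss
    Private Function Load(path As String) As LoadedDocument
        Dim doc As LoadedDocument = Nothing
        If Not cache.TryGetValue(path, doc) Then
            doc = New LoadedDocument With {.Content = disk(path)}
            cache(path) = doc
        End If
        Return doc
    End Function

    Sub Main()
        disk("temp.pdf") = "first document"
        Console.WriteLine(Load("temp.pdf").Content)   ' prints "first document"

        disk("temp.pdf") = "second document"          ' file overwritten under the same name
        Console.WriteLine(Load("temp.pdf").Content)   ' still prints "first document"
    End Sub

End Module
```

In this model, disposing whatever object wraps the loaded document would change nothing, because the lookup is decided by the path alone.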

It should be possible to use streams instead of files.
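
A minimal sketch of the stream approach, assuming your PDFsharp version exposes XPdfForm.FromStream and that the rebuilt pdf is available as a byte array. The module name, DrawPageFromBytes and its parameters are hypothetical; the drawing calls mirror those in AddPage().

```vbnet
Imports System.IO
Imports PdfSharp.Drawing
Imports PdfSharp.Pdf

Module StreamFormSketch

    ' Draws one page of an in-memory pdf onto outputPage.
    ' pdfBytes is assumed to hold the rebuilt pdf; pageIndex is zero-based.
    Sub DrawPageFromBytes(pdfBytes As Byte(), pageIndex As Integer,
                          outputPage As PdfPage, targetRect As XRect)
        Using source As New MemoryStream(pdfBytes)
            ' Assumption: FromStream is available in the PDFsharp version in use
            Using form As XPdfForm = XPdfForm.FromStream(source)
                form.PageIndex = pageIndex
                Using graphics As XGraphics = XGraphics.FromPdfPage(outputPage)
                    graphics.SmoothingMode = XSmoothingMode.HighQuality
                    graphics.DrawImage(form, targetRect)
                End Using
            End Using
        End Using
    End Sub

End Module
```

Since no file name is involved, there is no reused name for a new document to collide with, and no temporary file to clean up. Whether this also avoids the need to write the rebuilt pdfs to disk would have to be tested.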

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow