Token not expected processing PDF using PDFsharp

https://stackoverflow.com/questions/23539812

c#
pdfsharp

17-07-2023

Question

I have two very similar PDF files. Processing the first one throws the exception "Token '373071' was not expected", while the second one is processed without any problem. My code is below:

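using System;
using System.IO;
using PdfSharp.Pdf.IO;

class Program
{
    static void Main(string[] args)
    {
        int bufferSize = 20480;
        try
        {
            byte[] byteBuffer = new byte[bufferSize];
            byteBuffer = File.ReadAllBytes(@"..\..\Fail.pdf");
            MemoryStream coverSheetContent = new MemoryStream();

            coverSheetContent.Write(byteBuffer, 0, byteBuffer.Length);
            // Rewind so that subsequent reads start at the first byte.
            coverSheetContent.Position = 0;
            int t = PdfReader.TestPdfFile(coverSheetContent);
            PdfReader.Open(coverSheetContent);
        }
        catch (Exception ex)
        {
            // Report the failure instead of silently swallowing it.
            Console.WriteLine(ex.Message);
        }
    }
}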

I have also attached both PDF files. They are raw input for me; I do not know where or by whom they were created.

Fail.pdf

Success.pdf

There is very little information available about PDFsharp, so any help solving this problem would be appreciated.

Solution

The SAP tool that was used to create the PDF files adds many padding bytes after the "%%EOF" marker. PDFsharp up to version 1.32 expects the "%%EOF" marker within the last 130 bytes of the file.

You can modify the method ReadTrailer() in class Parser to search a larger area.

An implementation that searches the complete file can be found here:
http://forum.pdfsharp.net/viewtopic.php?p=583#p583
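
If modifying the PDFsharp source is not an option, a possible workaround is to cut off everything after the last "%%EOF" marker before handing the data to PDFsharp. The following is only a minimal sketch of that idea, not the patch from the forum post: the class PdfTrailerHelper and its methods are hypothetical names, and it assumes the bytes after the marker are pure padding that can be discarded safely.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of PDFsharp.
static class PdfTrailerHelper
{
    // ASCII bytes of the end-of-file marker.
    private static readonly byte[] EofMarker = Encoding.ASCII.GetBytes("%%EOF");

    // Returns the index of the last occurrence of "%%EOF" in data, or -1 if it is absent.
    public static int FindLastEofMarker(byte[] data)
    {
        for (int i = data.Length - EofMarker.Length; i >= 0; i--)
        {
            bool match = true;
            for (int j = 0; j < EofMarker.Length; j++)
            {
                if (data[i + j] != EofMarker[j])
                {
                    match = false;
                    break;
                }
            }
            if (match)
                return i;
        }
        return -1;
    }

    // Returns a copy of data that ends right after the last "%%EOF".
    // Assumption: everything after the marker is padding.
    // If no marker is found, the input array is returned unchanged.
    public static byte[] TrimAfterEof(byte[] data)
    {
        int index = FindLastEofMarker(data);
        if (index < 0)
            return data;

        int newLength = index + EofMarker.Length;
        byte[] trimmed = new byte[newLength];
        Array.Copy(data, trimmed, newLength);
        return trimmed;
    }
}
```

The trimmed array can then be wrapped in a MemoryStream and passed to PdfReader.Open, as in the question. Because only bytes at the end of the file are removed, the byte offsets stored in the cross-reference data, which are counted from the start of the file, should remain valid.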

BTW: You can open the PDF like this:

var doc = PdfReader.Open(@"..\..\fail.pdf");

No need to allocate a buffer that will never be used, no stream needed.

Update: Since 2014 PDFsharp searches the complete PDF file if the "%%EOF" marker cannot be found near the end of the file. So if you are using PDFsharp 1.50 or newer it is no longer necessary to download and modify the code. Those who still use PDFsharp 1.32 or even older versions still have to modify the source.

Licensed under: CC-BY-SA with attribution
