Debugging PDF for error

Question 1

Ok, this wasn't easy -

Due to a bug in PDFClown, the main content stream of the PDF page was corrupted. After its end it had a copy of a past instance of it. This caused a partial text section without the starting command "BT", which left a single "ET" without a "BT" at the end of the stream.

Once I corrected this, it ran great.

Thank you all for your help. I would have had a much more difficult time debugging it without the RUPS tool, which @Bruno suggested.

edit:

The bug was in Buffer.java, in clone() (line 217).

Instead of the line:

clone.append(data);

it needs to be:

clone.append(data, 0, this.length);

Without this correction it clones the whole data buffer and sets the cloned Buffer's length to data[].length. This is very problematic if Buffer.length is smaller than data[].length. The result in my case was garbage at the end of the stream.
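To make the effect concrete, here is a minimal sketch of this kind of bug. SimpleBuffer is a hypothetical, simplified stand-in for a growable byte buffer; it is not PDFClown's actual Buffer class, and only the two append calls mirror the lines quoted above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical, simplified growable byte buffer (NOT PDFClown's real class). */
class SimpleBuffer {
    private byte[] data;  // backing array, usually larger than the used part
    private int length;   // number of valid bytes in data

    SimpleBuffer(int capacity) {
        data = new byte[capacity];
        length = 0;
    }

    /** Appends count bytes of src, starting at offset, growing the array if needed. */
    void append(byte[] src, int offset, int count) {
        if (length + count > data.length) {
            data = Arrays.copyOf(data, Math.max(data.length * 2, length + count));
        }
        System.arraycopy(src, offset, data, length, count);
        length += count;
    }

    /** Appends the whole array, including any unused capacity it may carry. */
    void append(byte[] src) {
        append(src, 0, src.length);
    }

    /** Buggy variant: copies the entire backing array, so length becomes data.length. */
    SimpleBuffer buggyClone() {
        SimpleBuffer clone = new SimpleBuffer(data.length);
        clone.append(data);
        return clone;
    }

    /** Fixed variant: copies only the bytes that are actually in use. */
    SimpleBuffer fixedClone() {
        SimpleBuffer clone = new SimpleBuffer(data.length);
        clone.append(data, 0, this.length);
        return clone;
    }

    int length() {
        return length;
    }

    byte[] toByteArray() {
        return Arrays.copyOf(data, length);
    }
}

class BufferCloneDemo {
    public static void main(String[] args) {
        SimpleBuffer original = new SimpleBuffer(16);
        original.append("BT ET".getBytes(StandardCharsets.US_ASCII));

        System.out.println(original.buggyClone().length()); // prints 16
        System.out.println(original.fixedClone().length()); // prints 5
    }
}
```

In this sketch the 11 extra bytes of the buggy clone are zeros because the backing array is fresh. If the backing array had been used for longer content before, the extra bytes would be leftovers of that older content instead.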

Question 2

The error shows while reading (with Adobe) the attached file only when scrolling down to the 8th page, then scrolling back up to the 3rd page. Alternatively, zooming out to 33.3% will also produce the message.

Well, I get it more easily: I merely open the PDF and scroll down using the cursor keys. As soon as the top 2 cm of page 3 appear, the message appears.

What's wrong with my file??

The content of pages 1 and 2 looks ok, so let's look at the content of page 3.

My initial attribution of the issue to the use of text-specific operations (especially Tf and Tw) outside of a text object was wrong, as Stefano Chizzolini pointed out: some text-related operations are indeed allowed outside text objects, namely the text state operations; cf. figure 9 from the PDF specification:

[Figure 9: Graphics Objects]

So while being less common, text state operations at page description level are completely ok.

After my incorrect attempt to explain the issue, the OP's own answer indicated that the

main stream of information in the PDF page has been corrupted. After its end it had a copy of a past instance of it. This caused a partial text section without the starting command "BT" - which left a single "ET" without a "BT" at the end of the stream.

An ET without a prior BT would indeed be an error, and quite likely it would be accompanied by operations at the wrong level. Inspecting the stream content of that third page (the page this issue focuses on), however, I could not find any unmatched ET. What I did discover in the course of that inspection is that the content stream contains more than 2000 trailing 0 bytes! Adobe Reader seems not to be able to cope with these 0 bytes.

The bug the OP found can explain the issue:
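Such trailing 0 bytes are easy to miss when eyeballing a stream, so a small helper can count them. This is only a sketch: it assumes you already have the decoded content stream as a file of raw bytes (however you extracted it), and the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class TrailingZeros {
    /** Returns how many consecutive 0x00 bytes the given array ends with. */
    static int countTrailingZeroBytes(byte[] content) {
        int end = content.length;
        while (end > 0 && content[end - 1] == 0) {
            end--;
        }
        return content.length - end;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TrailingZeros <decoded-content-stream-file>");
            return;
        }
        // Assumption: the file holds the already decoded (uncompressed) stream bytes.
        byte[] content = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("trailing 0 bytes: " + countTrailingZeroBytes(content));
    }
}
```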

in the Buffer.java:clone() (line 217)

instead of line:
clone.append(data);
needs to be:
clone.append(data, 0, this.length);
Without this correction it clones the whole data buffer and sets the cloned Buffer's length to data[].length. This is very problematic if Buffer.length is smaller than data[].length.

Trailing 0 bytes can be an effect of such a buffer copying bug.

Furthermore, the symptoms found by the OP ("After its end it had a copy of a past instance of it") can also be the effect of such a bug. So I assume the OP found those symptoms on a different page, not page 3, but fixing the bug healed all symptoms.

How can I find what's wrong with it? Is there a tool which tells you where the error lies?

There are PDF syntax checkers, e.g. the Preflight tool included in Adobe Acrobat, but even that fails on your file.

So essentially you have to extract the page content (using a PDF browser, e.g. RUPS) and check manually with the PDF specification on the other screen.
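A small part of that manual check, the BT/ET pairing, can be roughly automated. The following is a deliberately naive sketch and not a PDF parser: it splits the decoded content stream on whitespace only, so string literals, comments or inline image data that happen to contain a standalone BT or ET token will produce false alarms. The class and method names are hypothetical.

```java
class TextObjectBalance {
    /**
     * Returns the index of the first whitespace-separated token that breaks
     * BT/ET pairing, the token count if a BT is still open at the end,
     * or -1 if all BT/ET operators pair up.
     */
    static int firstUnbalancedToken(String content) {
        String[] tokens = content.trim().split("\\s+");
        boolean inTextObject = false;
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals("BT")) {
                if (inTextObject) {
                    return i; // text objects must not be nested
                }
                inTextObject = true;
            } else if (tokens[i].equals("ET")) {
                if (!inTextObject) {
                    return i; // ET without a prior BT
                }
                inTextObject = false;
            }
        }
        return inTextObject ? tokens.length : -1;
    }

    public static void main(String[] args) {
        System.out.println(firstUnbalancedToken("BT /F1 12 Tf (Hi) Tj ET")); // prints -1
        System.out.println(firstUnbalancedToken("q Q ET"));                  // prints 2
    }
}
```

Anything this reports still has to be confirmed by looking at the stream itself with the specification at hand.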

Question 3

The general post about debugging PDFs might also have been helpful, as RUPS, pdfstreamdump etc. are mentioned there: How do you debug PDF files?