Question

1) I saw that some people have worked on hiding data between PDF objects. They said that this method works, but the big disadvantage is that Acrobat Reader asks to re-save the file when closing the window.

I don't understand what they mean by concealing information between PDF objects. Please, I need your help :)

2) I also saw that some people have worked on concealing information after the %%EOF marker and were told that this is not a solution, because signing is not applied to the metadata, which was a needed feature.

I also don't understand what they mean by metadata in this context.

I referred to this link: How to hide text in an PDF file?

Best regards,

Liszt.


Solution

1) I saw that some people have worked on hiding data between PDF objects. They said that this method works, but the big disadvantage is that Acrobat Reader asks to re-save the file when closing the window.

I don't understand what they mean by concealing information between PDF objects.

Normally your PDF is a sequence of PDF objects, each preceded by its identifying numbers, plus a cross-reference table mapping those numbers to the objects' positions in the PDF:

...
2 0 obj
/WinAnsiEncoding
endobj
3 0 obj
<<
/Type /Font
/Subtype /Type1
/BaseFont /Courier
/Name /F001
/Encoding 2 0 R
>>
endobj
4 0 obj
<<
/Type /Font
/Subtype /Type1
/BaseFont /Courier-Bold
/Name /F002
/Encoding 2 0 R
>>
....
xref
0 17
0000000000 65535 f
0000014476 00000 n
0000000017 00000 n
0000000052 00000 n
0000000205 00000 n
...

When a PDF parser needs an object (e.g. object 2), it usually just looks up the associated offset in the cross-reference table (for object 2 here, that is 17) and starts reading the file at byte 17. It first expects the object and generation numbers (2 0) and then the keyword obj; it parses everything after that keyword up to the matching endobj keyword and then stops. (Actually, some cases are a bit more twisted, but this is the general idea.)
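To make the lookup concrete, here is a minimal Python sketch. It assumes a single classic xref table with one subsection and no object streams, and the helper names (build_sample, read_xref, read_object) are made up for this illustration; real PDF parsers handle far more cases.

```python
import re


def build_sample():
    """Build a tiny PDF-like byte string with a matching xref table (illustrative only)."""
    header = b"%PDF-1.4\n"
    objects = [
        b"1 0 obj\n/WinAnsiEncoding\nendobj\n",
        b"2 0 obj\n<< /Type /Font /BaseFont /Courier /Encoding 1 0 R >>\nendobj\n",
    ]
    body, offsets = header, []
    for obj in objects:
        offsets.append(len(body))  # byte position where this object starts
        body += obj
    xref_pos = len(body)
    xref = "xref\n0 3\n0000000000 65535 f \n"
    xref += "".join(f"{off:010d} 00000 n \n" for off in offsets)
    trailer = f"trailer\n<< /Size 3 >>\nstartxref\n{xref_pos}\n%%EOF\n"
    return body + (xref + trailer).encode("ascii")


def read_xref(data):
    """Map object number -> byte offset for the in-use entries of one xref subsection."""
    start = int(re.findall(rb"startxref\s+(\d+)", data)[-1])
    lines = data[start:].splitlines()
    if lines[0].strip() != b"xref":
        raise ValueError("no classic xref table at the startxref offset")
    first, count = map(int, lines[1].split())
    table = {}
    for i, line in enumerate(lines[2:2 + count]):
        offset, _generation, kind = line.split()
        if kind == b"n":  # "n" marks an in-use entry, "f" a free one
            table[first + i] = int(offset)
    return table


OBJ = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.S)


def read_object(data, table, number):
    """Jump to the offset from the xref table and read up to the matching endobj."""
    m = OBJ.match(data, table[number])
    if m is None or int(m.group(1)) != number:
        raise ValueError(f"object {number} not found at offset {table[number]}")
    return int(m.group(1)), int(m.group(2)), m.group(3).strip()


data = build_sample()
table = read_xref(data)
print(table)
print(read_object(data, table, 1))
```

Note that read_object never looks at the bytes between one endobj and the start of the next object; it only ever jumps to offsets listed in the table.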

Thus, some people think it is a good idea to put their secret data between the endobj of one PDF object and the object number of the next, like this:

2 0 obj
/WinAnsiEncoding
endobj
HERE ARE MY VERY SECRET VERY HIDDEN DATA, PROBABLY ENCRYPTED ETC
3 0 obj

Some PDF readers, however, do recognize that there are trash bytes between the objects and offer to save the file without them. That is the re-save prompt you were told about.
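How a reader might notice such bytes can be sketched roughly as follows. This is only a heuristic written for illustration (the name find_gaps is made up): it ignores legitimate comments between objects and would be fooled by stream contents that happen to contain the word endobj.

```python
import re

# Everything after an "endobj" up to the next "N G obj", "xref", "startxref" or end of data.
GAP = re.compile(rb"endobj(.*?)(?=\d+\s+\d+\s+obj\b|xref\b|startxref\b|\Z)", re.S)


def find_gaps(data):
    """Return (offset, bytes) pairs for non-whitespace data found between objects."""
    gaps = []
    for m in GAP.finditer(data):
        junk = m.group(1).strip()
        if junk:
            gaps.append((m.start(1), junk))
    return gaps


SAMPLE = (
    b"2 0 obj\n/WinAnsiEncoding\nendobj\n"
    b"HERE ARE MY VERY SECRET VERY HIDDEN DATA\n"
    b"3 0 obj\n<< /Type /Font >>\nendobj\n"
    b"xref\n"
)

for offset, junk in find_gaps(SAMPLE):
    print(offset, junk)
```

A tool that rewrites the file from the objects it actually parsed would simply drop whatever find_gaps reports here.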

2) I also saw that some people have worked on concealing information after the %%EOF marker and were told that this is not a solution, because signing is not applied to the metadata, which was a needed feature.

Most PDF readers ignore trash data after the %%EOF marker, because in times long gone some PDF generation or transport processes left additional trash there.

...
%%EOF
AGAIN SOME SECRET DATA

When PDF software itself manipulates a PDF, though, e.g. when signing it, it may simply throw out everything that should not be there according to the PDF specification. Or, in the case of signing, it may leave the trailing bytes where they are and append the signature after them. A program expecting those extra data at the end of the file may then not find them anymore, as they are now somewhere inside the file.
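The second failure mode can be simulated in a few lines of Python. This is a sketch under stated assumptions: fake_update is only a stand-in for the bytes a signing tool might append, not a real signature or a valid incremental update, and trailing_data is a made-up helper name.

```python
EOF_MARKER = b"%%EOF"


def trailing_data(data):
    """Return whatever follows the last %%EOF marker, stripped of whitespace."""
    idx = data.rfind(EOF_MARKER)
    if idx == -1:
        raise ValueError("no %%EOF marker found")
    return data[idx + len(EOF_MARKER):].strip()


original = b"%PDF-1.4\n1 0 obj\n<< >>\nendobj\n%%EOF\nAGAIN SOME SECRET DATA\n"

# Stand-in for content appended by a later processing step such as signing.
fake_update = b"2 0 obj\n<< /Note (stand-in for a signature) >>\nendobj\n%%EOF\n"
updated = original + fake_update

print(trailing_data(original))  # the secret data is still at the end
print(trailing_data(updated))   # nothing after the new, final %%EOF
print(b"AGAIN SOME SECRET DATA" in updated)  # the bytes still exist, just not at the end
```

A consumer that only looks after the last %%EOF comes up empty on the updated file, even though the bytes are still in it.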

I also don't understand what they mean by metadata in this context.

Some people actually use such mechanisms to add information required in later processing steps. E.g. the process creating a PDF invoice may add the address to send the PDF to and the amount to pay at the end of the file; then the PDF is processed some more, e.g. reviewed or archived, and in some final process it is sent out to the addressee.

The review step might be handled differently depending on the amount added at the end; maybe sales worth more than $1000 must be cleared by special personnel.

The sending process may also use the extra data after the end of file for sending the file to the recipient.

Such data about a document are sometimes called metadata.
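Such a workflow might look roughly like the following Python sketch. Everything in it is hypothetical: the JSON encoding, the field names recipient and amount, the example address, and the helper names are assumptions made for illustration, not something the processes described above are known to use.

```python
import json

EOF_MARKER = b"%%EOF"


def append_metadata(pdf_bytes, metadata):
    """Creation step: append a JSON record after the end of the PDF."""
    return pdf_bytes.rstrip() + b"\n" + json.dumps(metadata).encode("utf-8") + b"\n"


def read_metadata(pdf_bytes):
    """Later step: read the JSON record after the last %%EOF, or None if there is none."""
    idx = pdf_bytes.rfind(EOF_MARKER)
    if idx == -1:
        raise ValueError("no %%EOF marker found")
    tail = pdf_bytes[idx + len(EOF_MARKER):].strip()
    return json.loads(tail) if tail else None


def needs_special_clearance(metadata, limit=1000):
    """Review step: invoices above the limit go to special personnel."""
    return metadata["amount"] > limit


invoice = b"%PDF-1.4\n1 0 obj\n<< >>\nendobj\n%%EOF\n"
tagged = append_metadata(invoice, {"recipient": "billing@example.com", "amount": 1250})

meta = read_metadata(tagged)
print(meta, needs_special_clearance(meta))
```

If any step in between rewrites or signs the file, read_metadata may return None or fail, which is exactly the weakness described above.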

Licensed under: CC-BY-SA with attribution