Using Preview on Mac, why does simply saving a PDF over itself with no changes made completely change the file's contents?

StackOverflow https://stackoverflow.com/questions/12519148

Question

I have a three-page PDF file open in Preview on the Mac. If I make no changes to the file, hit cmd-s and save it, the binary content of the file changes heavily. Why is this?

I can tell this is the case because of my process:

  1. make a duplicate copy of a pdf (cp a.pdf b.pdf)

  2. vimdiff a.pdf b.pdf (no changes, exactly the same content)

  3. open a.pdf in Preview (make no edits)

  4. vimdiff a.pdf b.pdf (no changes, exactly the same content)

  5. hit cmd-s (save pdf)

  6. vimdiff a.pdf b.pdf (tons of changes, well beyond the pdf's meta-data)
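The same before/after check can be scripted instead of eyeballed in vimdiff. What follows is only a minimal sketch: the helper names `first_difference` and `compare_files` are made up for illustration, and it assumes both files fit comfortably in memory.

```python
import hashlib
from pathlib import Path


def first_difference(data_a: bytes, data_b: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(data_a, data_b)):
        if x != y:
            return offset
    # One input is a prefix of the other: they differ where the shorter ends.
    if len(data_a) != len(data_b):
        return min(len(data_a), len(data_b))
    return None


def compare_files(path_a, path_b):
    """Summarize how two files differ at the byte level (hypothetical helper)."""
    data_a = Path(path_a).read_bytes()
    data_b = Path(path_b).read_bytes()
    return {
        "identical": data_a == data_b,
        "size_a": len(data_a),
        "size_b": len(data_b),
        "sha256_a": hashlib.sha256(data_a).hexdigest(),
        "sha256_b": hashlib.sha256(data_b).hexdigest(),
        "first_difference": first_difference(data_a, data_b),
    }
```

Calling `compare_files("a.pdf", "b.pdf")` after steps 2, 4 and 6 should report `identical` as true for the first two and false for the last, if the behavior described here holds.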

Can anyone explain why/how a PDF gets "re-written" even though no changes were made?


Solution

Indeed, the first time you save a PDF that was originally created by a non-Quartz application, Preview heavily rewrites it.

I earn (part of) my living with debugging PDFs.

I have made it a habit never to respond to a customer with suggestions for fixing any reported problem (even the simplest one) once I discover that the provided sample PDF has been touched by Quartz. Fortunately, Apple admits its involvement by updating the /Producer metadata key to Mac OS X 10.7.4 Quartz PDFContext or similar. My reasons:

  • I never know whether this PDF is the original one that exhibited the described problem, or whether the customer was just trying to mail the original problem PDF from his MacBook and unintentionally re-saved and rewrote it while using his mail app.

  • Therefore I always need to first establish a procedure with Apple customers which guarantees that I get to analyze the original PDF files exhibiting a particular bug or problem, not the ones which were spoiled by Quartz/Preview. I misspent quite a few man-days of work on 'analyzing' the wrong files before I discovered the problem about a year ago....
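A quick first check for the /Producer stamp mentioned above could look like the following. This is a rough heuristic sketch, not a PDF parser: it assumes the Info dictionary is stored uncompressed and that the value is a plain literal string without nested or escaped parentheses. Files that keep their metadata in compressed object streams, in hex strings, or only in XMP will be missed. The function names are invented for this example.

```python
import re

# Matches a literal-string entry such as: /Producer (Some Tool 1.0)
# Assumption: the value contains no nested or escaped parentheses.
PRODUCER_RE = re.compile(rb"/Producer\s*\(([^()]*)\)")


def find_producers(pdf_bytes: bytes):
    """Return every /Producer literal string found in the raw bytes."""
    return [m.group(1).decode("latin-1") for m in PRODUCER_RE.finditer(pdf_bytes)]


def touched_by_quartz(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any /Producer value mentions Quartz."""
    return any("Quartz" in producer for producer in find_producers(pdf_bytes))
```

For a real file, something like `touched_by_quartz(open("sample.pdf", "rb").read())` would do. A False result proves little given the caveats above, so a proper PDF tool is still needed for anything conclusive.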

A lot of True Believers of the Apple Cult are not aware of this behavior, and a lot of prepress Pros are also completely clueless about it.

When saving a PDF a second time, chances are that only the /ModDate metadata key is updated (unless you're using a newer version of Quartz on your Mac)... but you never know until you take a really, really close look at the PDFs in question.


Update (with some additional info)

BTW, for me the simple hit on [cmd]+[s] does not yet change the PDF. But I'm on Mac OS X Lion 10.7.4, with Preview.app Version 552 (719.23). On Lion the change is triggered by saving the file under a new name (Duplicate => Save...).

k00k seems to be on Mac OS X Mountain Lion 10.8.1, with Preview.app Version 6.00 (765). For him a simple hit on [cmd]+[s] suffices to trigger the change.


(I'm not saying that the changes Apple makes to Preview-ed PDF files are necessarily bad. In quite some cases this may silently 'repair' damaged files and could be argued to be 'user-friendly' behavior. -- What I'm saying is that there are changes (whether for the good or for the bad is irrelevant) which go beyond metadata-stamping a new /ModDate value into the file, and which can make troubleshooting PDF problems very painful...)

OTHER TIPS

I do not have the source code of the Preview application, so I cannot say for sure. My guess is that it does not simply save the same data that was loaded; instead, it seems to reconstruct an "equivalent" PDF file.

Additionally, when a PDF file is "re-created", there are a few items inside that will always be different (unique IDs, the last modification date/time, etc).
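To see whether these always-changing items differ between two copies, one can pull them out of the raw bytes. The sketch below is an illustration under assumptions: it expects the trailer /ID to be written as two hex strings and /ModDate as an uncompressed literal string, and it takes the last occurrence of each on the assumption that incremental updates are appended at the end of the file. `volatile_fields` and `changed_fields` are hypothetical names.

```python
import re

# Assumed layouts: /ID [<hex> <hex>] and /ModDate (D:YYYYMMDD...)
ID_RE = re.compile(rb"/ID\s*\[\s*<([0-9A-Fa-f]*)>\s*<([0-9A-Fa-f]*)>\s*\]")
MODDATE_RE = re.compile(rb"/ModDate\s*\(([^()]*)\)")


def volatile_fields(pdf_bytes: bytes):
    """Extract the last /ID pair and /ModDate seen in the raw bytes."""
    ids = ID_RE.findall(pdf_bytes)
    dates = MODDATE_RE.findall(pdf_bytes)
    return {
        "id": tuple(part.decode("ascii").lower() for part in ids[-1]) if ids else None,
        "mod_date": dates[-1].decode("latin-1") if dates else None,
    }


def changed_fields(before: bytes, after: bytes):
    """List which of the extracted fields differ between two files."""
    a, b = volatile_fields(before), volatile_fields(after)
    return sorted(key for key in a if a[key] != b[key])
```

If `changed_fields` reports only these two items while a byte-level diff shows far more, the remaining differences come from the rewrite itself rather than from routine stamping.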

Licensed under: CC-BY-SA with attribution