Question

I need to write a bash script to tell if a file is a pdf file. However, I cannot simply use the file name or extension.

For example:

test.pdf.encrypt - will not open as the file itself is encrypted and the file is of an unknown type the computer can't recognize.

test.pdf.decrypt - will open even though the extension is .decrypt

As looking at the extension does not help and both the encrypted and decrypted file have .pdf in the middle of the name, is there a way to have the system test and see if the file is even readable with a pdf reader?

I just need the command that I can input into an if statement in bash.

if [this file is a working pdf file]; then
   echo "$file is a working pdf file."
fi

Solution 2

A PDF file normally starts with the marker %PDF, so you can compare the first 4 bytes of the file against it. Keep in mind that this only checks the header; it does not prove that the rest of the file is intact.

FILE="/Users/Tim/Documents/My File.pdf"
if [ "$(head -c 4 "$FILE")" = "%PDF" ]; then
    echo "It's a PDF!"
fi
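
To drop this check straight into an if statement, it can be wrapped in a function. The following is only a sketch: the name is_pdf_header is made up for illustration, and it compares the first five bytes against %PDF- (the marker plus the dash that precedes the version number). It also returns false for empty, missing, or unreadable files. Like the snippet above, it inspects the header only, so a truncated or damaged file that still begins with %PDF- would pass.

```shell
#!/usr/bin/env bash

# is_pdf_header FILE
# Hypothetical helper (not a standard command): succeeds (exit status 0)
# when FILE is a readable regular file whose first five bytes are "%PDF-".
is_pdf_header() {
    local path=$1 magic
    # Reject missing files, directories, and unreadable files up front.
    [ -f "$path" ] && [ -r "$path" ] || return 1
    # Read the first five bytes; tr drops NUL bytes, which a shell
    # variable cannot hold anyway.
    magic=$(LC_ALL=C head -c 5 -- "$path" 2>/dev/null | tr -d '\0')
    [ "$magic" = "%PDF-" ]
}

# Usage:
#   if is_pdf_header "$file"; then
#       echo "$file starts with a PDF header."
#   fi
```

Because the function reports its result through its exit status, no brackets are needed around the call in the if statement.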

OTHER TIPS

Another option is to run the file command on it:

type="$(file -b "$file")"
if [ "${type%%,*}" = "PDF document" ]; then
  echo "$file is a PDF file."
fi

On Linux and Solaris, the file command identifies a file's type from its contents, regardless of the file extension; a PDF document is one of the many types it recognizes. Use -b so the file name is left out of the output and cannot cause a false match:

file -b filename.xxx | grep -q '^PDF document' && echo 'is pdf file' || echo 'is not pdf'

The file command reports the type of a file regardless of its extension:

  $ file Confirmation.pdf 
  Confirmation.pdf: PDF document, version 1.5
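
Matching the human-readable description works, but the wording can differ between versions of file. A possible alternative, sketched here under the assumption that the installed file supports the --mime-type option (the common open-source implementation on Linux and macOS does; other systems may not), is to compare the reported MIME type against application/pdf. The function name is_pdf_mime is made up for illustration.

```shell
#!/usr/bin/env bash

# is_pdf_mime FILE
# Hypothetical helper: succeeds when file(1) reports application/pdf.
# Assumption: the installed file(1) accepts -b and --mime-type.
is_pdf_mime() {
    local mime
    # -b omits the file name, so only the MIME type is printed.
    mime=$(file -b --mime-type -- "$1" 2>/dev/null) || return 1
    [ "$mime" = "application/pdf" ]
}

# Usage:
#   if is_pdf_mime "$file"; then
#       echo "$file is a PDF file."
#   fi
```

As with the other checks, this identifies the file type from its contents; it does not guarantee that a PDF reader can open the file without errors.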
Licensed under: CC-BY-SA with attribution