Question

If someone sends me a document (.pdf, .doc, .xls, .ppt, .ogg, .mp3, .png, etc.) without the extension, how can I determine the file type? The /usr/bin/file command doesn't always guess right, or it simply says that I have a Microsoft Office document. I would like to know the exact type so I can add the extension to the file name.

Solution

You can come up with your own rules by adding them to /etc/magic.

See man file for more details. It is tricky to always get these correct; however, I have had reasonable success.
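
To illustrate what a rule of this kind expresses, here is a minimal Python sketch. It is not the magic(5) syntax itself, only the same idea: each rule says "at this offset, expect these bytes". The names RULES, guess_extension and guess_file are made up for this example, and the table only lists a handful of well-known signatures.

```python
# Each rule mirrors the idea behind a magic entry:
# (byte offset, expected bytes at that offset, extension to suggest).
RULES = [
    (0, b"%PDF-", ".pdf"),
    (0, b"\x89PNG\r\n\x1a\n", ".png"),
    (0, b"OggS", ".ogg"),
    (0, b"ID3", ".mp3"),      # only catches MP3 files that start with an ID3v2 tag
    (257, b"ustar", ".tar"),  # a signature does not have to sit at offset 0
]


def guess_extension(data, rules=RULES):
    """Return the extension of the first matching rule, or None."""
    for offset, magic, ext in rules:
        if data[offset:offset + len(magic)] == magic:
            return ext
    return None


def guess_file(path, rules=RULES):
    """Apply the rules to the first 512 bytes of a file on disk."""
    with open(path, "rb") as fh:
        return guess_extension(fh.read(512), rules)
```

Extending it is a matter of appending another tuple to RULES, which is roughly what adding a line to /etc/magic does for file. Rules are tried in order, so more specific signatures should come first.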

OTHER TIPS

Try mimetype(1).

For Perl, look at File::MimeInfo.

Some of the other posters thus far appear to neglect a few things.

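File::MimeInfo reads the freedesktop.org shared MIME-info database rather than the magic database used by 'file', but both identify contents by matching known signatures, so it's unlikely to do much better on the files that 'file' gets wrong.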

File::Type is likely to be more interesting, as it relies only on itself, although this leads to a comically long script full of 'if' statements. By its very nature, though, it is unlikely to cover things that 'file' doesn't already cover.

The best you can do with unknown file types is to try cracking them open with a hex editor, or running them through 'strings' and seeing if you recognise anything. If you do work out how to identify a file, you may wish to go for File::Type as your solution because, as far as I can make out, it's at least easy to extend.
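
For the "Microsoft Office document" case from the question, the same look-inside approach can be automated. The following is a rough Python sketch under a few assumptions: legacy .doc/.xls/.ppt files are OLE2 containers whose stream names are stored as UTF-16LE text, and the newer .docx/.xlsx/.pptx files are ZIP archives with a word/, xl/ or ppt/ folder. The function name office_extension and the lookup tables are invented for this example. Scanning for stream names is a crude heuristic (a Word file with an embedded spreadsheet would contain both names), so treat the result as a hint.

```python
import io
import zipfile

OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"

# Stream names to look for inside legacy (OLE2) Office files, in priority order.
OLE_STREAMS = [
    ("WordDocument", ".doc"),
    ("Workbook", ".xls"),
    ("PowerPoint Document", ".ppt"),
]

# Top-level folders found inside the ZIP-based Office formats.
OOXML_DIRS = [("word/", ".docx"), ("xl/", ".xlsx"), ("ppt/", ".pptx")]


def office_extension(data):
    """Guess an Office extension from raw file bytes, or return None."""
    if data.startswith(OLE2_MAGIC):
        # Same idea as running 'strings': search for a recognisable name.
        for name, ext in OLE_STREAMS:
            if name.encode("utf-16-le") in data:
                return ext
        return None
    if data.startswith(b"PK\x03\x04"):
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as zf:
                names = zf.namelist()
        except zipfile.BadZipFile:
            return None
        for prefix, ext in OOXML_DIRS:
            if any(n.startswith(prefix) for n in names):
                return ext
        return ".zip"  # a ZIP archive, but not a recognised Office package
    return None
```

To use it on a real file, read the whole file in binary mode, for example office_extension(open(path, "rb").read()), since the ZIP directory sits at the end of the archive.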

You can use the Perl module File::Type.

Licensed under: CC-BY-SA with attribution