I'd like to read the first couple of kilobytes of a file of unknown type and see whether it matches any known file type (e.g. MP3, JPEG, etc.). I was thinking of trying to load metadata from the file with libraries like PIL, sndhdr, py264, etc. and seeing whether any of them recognized a valid format, but I figured this must be a problem someone has solved before.

Is there a single library, or a gist showing how to combine several libraries, that does this?


Solution

Use python-magic to do the fingerprinting.

The library can determine the file type from a buffer of bytes alone, with no file on disk required:

import magic

# start_data_from_something: the first few kilobytes of the file, as bytes
magic.from_buffer(start_data_from_something)

The library provides access to the libmagic file type identification library, which also drives the UNIX file command.
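
To connect this back to the question, here is a minimal sketch of reading only the start of a file and handing it to python-magic. The helper names `read_head` and `identify` and the 2048-byte default are my own choices for illustration, not part of the library; `magic.from_buffer` and its `mime` keyword are the library's. It assumes python-magic and the underlying libmagic are installed.

```python
def read_head(path, size=2048):
    """Return up to `size` bytes from the start of the file at `path`."""
    # Binary mode: signatures are raw bytes, not text.
    with open(path, "rb") as f:
        return f.read(size)


def identify(path, size=2048, mime=False):
    """Describe a file's type from its first `size` bytes using python-magic.

    Hypothetical helper: with mime=False the result is a human-readable
    description, with mime=True it is a MIME type string.
    """
    # Imported here so read_head() stays usable where libmagic is not installed.
    import magic

    return magic.from_buffer(read_head(path, size), mime=mime)
```

Calling `identify("some_file")` gives the same kind of description the `file` command prints, and `identify("some_file", mime=True)` gives a MIME type you can compare against the formats you care about. If results look wrong for some format, try passing a larger `size`, since a very short buffer may not contain enough of the header to match.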

Licensed under: CC-BY-SA with attribution