Question

Files are usually categorized by their file extension. My question is: how can I identify a file's type even when its extension has been changed?

For example, I have a video file named myVideo.mp4 and I rename it to myVideo.txt. If I double-click it, the default text editor opens the file and does not show the actual content. But if I open myVideo.txt in a video player, the video plays without any problem.

I am thinking of developing an application that determines a file's type without checking the file extension and then suggests software for opening it. I would like to develop the application in Java.


Solution

File types can be identified from structure, magic numbers, metadata, strings and regular expressions, heuristics and statistical analysis. Whichever techniques are used, the tool will only be as good as the database of rules behind it.

Try DROID (Digital Record Object IDentification tool) for identifying file types; it is written in Java and BSD-licensed. It is a free project of the National Archives UK and is unrelated to Android. The source is available on GitHub and SourceForge, and the DROID documentation is good.

See also Darwinsys file and libmagic.

Other answers

One of the best libraries for this is Apache Tika. It does not only read the file's header; it can also perform content analysis to detect the file type. Using Tika is very simple. Here is an example of detecting a file's type:

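import java.io.IOException;
import java.net.URL;
import org.apache.tika.Tika; // Tika facade class

public class TestTika {

    // new URL(...) and Tika.detect(URL) both throw checked IOExceptions
    public static void main(String[] args) throws IOException {
        Tika tika = new Tika();
        // Returns the detected media type as a string, e.g. "image/jpeg"
        String fileType = tika.detect(new URL("http://example.com/someFile.jpg"));
        System.out.println(fileType);
    }

}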

There's a tool called TrID that does what you are after. It currently supports 5033 different file types and can be trained to add new ones. On *nix systems, there's also the file command, which does something similar.

It comes down to keeping a database of the file formats you want to recognize, so that your app does not need to look at the extension, much as Linux does. Whenever you open a file, you check it against the file-format database to see which type it belongs to. I am not sure how well this works for every file type, but most formats have a fixed header, be it zip, pdf, mpg, avi, png, etc., so this approach should work.
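As a rough illustration of the header-database idea, here is a minimal sketch in plain Java (11+). The class name MagicSniffer, the labels, and the three-entry signature table are made up for this example; a real tool needs a far larger rule set. The PNG and PDF signatures sit at offset 0, while MP4-family files carry the marker "ftyp" at offset 4.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class MagicSniffer {

    /** One rule: these bytes must appear at this offset. (Hypothetical helper type.) */
    private static final class Signature {
        final int offset;
        final byte[] bytes;
        final String label;

        Signature(int offset, byte[] bytes, String label) {
            this.offset = offset;
            this.bytes = bytes;
            this.label = label;
        }

        boolean matches(byte[] head) {
            if (head.length < offset + bytes.length) return false;
            for (int i = 0; i < bytes.length; i++) {
                if (head[offset + i] != bytes[i]) return false;
            }
            return true;
        }
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    // Tiny illustrative "database"; the labels are free-form descriptions.
    private static final List<Signature> SIGNATURES = List.of(
        new Signature(0, new byte[] {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A}, "PNG image"),
        new Signature(0, ascii("%PDF-"), "PDF document"),
        new Signature(4, ascii("ftyp"), "ISO base media file (MP4 family)")
    );

    /** Returns the label of the first matching rule, or null if nothing matches. */
    public static String detect(Path path) throws IOException {
        byte[] head;
        try (InputStream in = Files.newInputStream(path)) {
            head = in.readNBytes(16); // only the first bytes are needed
        }
        for (Signature s : SIGNATURES) {
            if (s.matches(head)) return s.label;
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java MagicSniffer <file>");
            return;
        }
        String label = detect(Paths.get(args[0]));
        System.out.println(label != null ? label : "unknown");
    }
}
```

The extension never enters the decision, so a renamed myVideo.txt would still be matched by the "ftyp" rule, provided it really is an MP4-family file.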

You could try MimeUtil2, but it's quite old and therefore not up to date. The best way is still the file extension.

But the solution from Adam is not as bad as you think. You could build a platform-independent solution using a wrapper around command-line calls. I think you will get much better results using this method.
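A wrapper of that kind might look like the sketch below. It assumes a `file` binary on the PATH that accepts the `--brief` and `--mime-type` options (as the common *nix implementation does); the class name FileCommand is invented for the example, and on Windows you would have to ship or substitute an equivalent tool.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FileCommand {

    /** Runs `file --brief --mime-type <path>` and returns its trimmed output. */
    public static String mimeType(Path path) throws IOException, InterruptedException {
        // start() throws IOException if the `file` binary cannot be found
        Process process = new ProcessBuilder("file", "--brief", "--mime-type", path.toString())
                .redirectErrorStream(true) // merge stderr into stdout
                .start();
        String output = new String(process.getInputStream().readAllBytes(),
                StandardCharsets.UTF_8).trim();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("file exited with " + exitCode + ": " + output);
        }
        return output;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java FileCommand <file>");
            return;
        }
        System.out.println(mimeType(Paths.get(args[0])));
    }
}
```

Passing the arguments as separate strings to ProcessBuilder avoids shell quoting problems with unusual file names.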

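The following code snippet retrieves information about the file type. Note that MimetypesFileTypeMap looks the type up from the file extension, so it will not recognize a renamed file:

import java.io.File;
import javax.activation.MimetypesFileTypeMap;

final File file = new File("file.txt");
System.out.println("File type is: " + new MimetypesFileTypeMap().getContentType(file));

Hopefully this helps.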

Licensed under: CC-BY-SA with attribution