Question

How do you go about verifying the type of an uploaded file reliably without using the extension? I'm guessing that you have to examine the header / read some of the bytes, but I really have no idea how to go about it. I'm using C# and ASP.NET.

Thanks for any advice.


OK, so from the links in the answers I now know that I am looking for 'ff d8 ff e0' to positively identify a .jpg file, for example.

In my code I can read the first twenty bytes no problem:

                byte[] b = new byte[20];
                using (FileStream fs = File.OpenRead(filePath)) // 'using' closes the file afterwards
                {
                    fs.Read(b, 0, b.Length); // may read fewer than 20 bytes if the file is short
                }

So (and please excuse my total inexperience here), how do I check whether the byte array starts with 'ff d8 ff e0'?


Solution

Here's a quick-and-dirty response to the follow-up question you posted:

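// Signature (magic number) of a JFIF-style JPEG file.
byte[] jpg = new byte[] { 0xFF, 0xD8, 0xFF, 0xE0 };
// b is the header buffer read from the file; it must be at least as long as the signature.
bool match = b.Length >= jpg.Length;
for (int i = 0; match && i < jpg.Length; i++)
{
    if (jpg[i] != b[i])
    {
        match = false; // first mismatch ends the loop
    }
}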
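
The same comparison can be wrapped in a small helper so that several signatures are checked in one pass. This is only a sketch: the class name FileSignatures, the method names, and the choice of signatures are assumptions for illustration, not part of the original answer. It matches JPEG on the first three bytes only (FF D8 FF), because the fourth byte differs between JFIF (E0) and Exif (E1) files.

```csharp
using System;
using System.Collections.Generic;

public static class FileSignatures
{
    // Hypothetical table of well-known magic numbers; extend as needed.
    private static readonly Dictionary<string, byte[]> Known = new Dictionary<string, byte[]>
    {
        { "jpg", new byte[] { 0xFF, 0xD8, 0xFF } },
        { "png", new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A } },
        { "gif", new byte[] { 0x47, 0x49, 0x46, 0x38 } }, // "GIF8"
        { "pdf", new byte[] { 0x25, 0x50, 0x44, 0x46 } }, // "%PDF"
        { "zip", new byte[] { 0x50, 0x4B, 0x03, 0x04 } }  // "PK" 03 04
    };

    // True when 'header' begins with every byte of 'signature'.
    public static bool StartsWith(byte[] header, byte[] signature)
    {
        if (header == null || signature == null || header.Length < signature.Length)
            return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (header[i] != signature[i])
                return false;
        }
        return true;
    }

    // Returns the name of the first matching type, or null when nothing matches.
    public static string Detect(byte[] header)
    {
        foreach (KeyValuePair<string, byte[]> entry in Known)
        {
            if (StartsWith(header, entry.Value))
                return entry.Key;
        }
        return null;
    }
}
```

With the b array from the question, FileSignatures.Detect(b) == "jpg" would stand in for the loop above. A matching signature only shows how the file begins; it does not prove that the rest of the file is valid.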

OTHER TIPS

That is indeed what the Unix file program does, with greater or lesser degrees of reliability. In part, it depends on whether the programs whose files you are trying to detect emit a file header; the program tar is notorious for not doing so. It also depends on how many types of files you plan to recognize, but it might well be simplest to use an implementation of file: it recognizes many file types, and modern versions are extensible via a file of extra file type definitions that can handle a multitude of scenarios.

The first few bytes of a file will often tell you the file type. See, for example,
http://www.garykessler.net/library/file_sigs.html
http://www.astro.keele.ac.uk/oldusers/rno/Computing/File_magic.html

Use System.IO to read the bytes as binary after the upload.
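
As a rough sketch of that read: Stream.Read may return fewer bytes than requested, so looping until the buffer is full (or the stream ends) is safer than a single call. The names UploadHeader and ReadHeader are hypothetical. In ASP.NET the stream would typically be the uploaded file's HttpPostedFile.InputStream; any Stream positioned at the start of the data works the same way.

```csharp
using System;
using System.IO;

public static class UploadHeader
{
    // Reads up to 'count' bytes from the current position of 'input'.
    // The returned array is shorter than 'count' if the stream ends early.
    public static byte[] ReadHeader(Stream input, int count)
    {
        byte[] buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = input.Read(buffer, total, count - total);
            if (read == 0)
                break; // end of stream reached
            total += read;
        }
        if (total < count)
            Array.Resize(ref buffer, total); // trim unused space
        return buffer;
    }
}
```

A call such as UploadHeader.ReadHeader(postedFile.InputStream, 20), where postedFile is an assumed HttpPostedFile variable, would yield the bytes to compare against known signatures.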

I'm curious, though, why you can't rely on the ContentType header?

Wotsit is a good resource for finding out the magic numbers for various file types.

Reading the contents of the file is the foolproof way. Since you are building it in .NET, you could probably check the MIME type of the uploaded file.

You can DllImport urlmon.dll to help. See this post: http://coding-passion.blogspot.com/2008/11/validating-file-type.html

And to clarify regarding Content-Type: it is invariably driven by the extension of the file. So even if a .zip file has its extension renamed to .txt, the content type will still say text.

Licensed under: CC-BY-SA with attribution