Question

I want to write JavaScript code that checks a file's type. The web application I am building lets users upload document files (doc, xls, ppt, docx, xlsx, pptx, txt, rar, zip, pdf, jpg, png, gif, jpeg, odt) and should reject everything else. I can't rely on the extension in the file name alone, because the user can change it.

I tried checking the Content-Type, but that keeps changing as well. Suggestions are appreciated.

Solution

In "modern" browsers (IE10+, Firefox 4+, Chrome 7+, Safari 6.0.2+ etc.), you could use the File/FileReader API to read the contents of the file and parse it client-side. E.g. (example, not production code):

// Assumes the page contains a single <input type="file"> element.
var fileInput = document.querySelector('input[type="file"]');

fileInput.addEventListener("change", function(e) {
    var file = e.currentTarget.files[0];
    if (!file) return; // No file selected, e.g. the dialog was cancelled.
    var reader = new FileReader();
    reader.onload = fileLoaded;
    reader.readAsArrayBuffer(file);
});

function fileLoaded(e)
{
   var arrayBuffer = e.currentTarget.result;

   // Look at no more than the first 32 bytes of the buffer. The length is
   // clamped because requesting more bytes than the buffer holds throws a
   // RangeError (e.g. for a file shorter than 32 bytes).
   var bytes = new Uint8Array(arrayBuffer, 0, Math.min(32, arrayBuffer.byteLength));

   // Now we can check the content, comparing to whatever file signatures we
   // have, e.g.:

   if (bytes[0] == 0x50 &&
       bytes[1] == 0x4b &&
       bytes[2] == 0x03 &&
       bytes[3] == 0x04)
   {
      // This is most likely docx, xlsx, pptx or other zip file.
   }
}

http://jsfiddle.net/35XfG/

Note, however, that a .zip doesn't have to start with 50 4b 03 04; an empty archive, for example, starts with 50 4b 05 06. So, unless you spend quite a bit of time looking into different file signatures (or find a library that has already done this), you're likely to reject files that are actually valid. Of course, it's also possible to get false positives.
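
If you do go down the signature route, a small lookup table keeps the comparison readable. The following is only a sketch: the names SIGNATURES and detectSignature are made up for illustration, the table covers just the common leading bytes of some of the formats in the question, and plain text files have no signature at all, so they cannot be recognised this way.

```javascript
// Illustrative, non-exhaustive table of leading bytes ("magic numbers").
// All entries are matched at offset 0.
var SIGNATURES = [
  { type: "pdf",       bytes: [0x25, 0x50, 0x44, 0x46] },                         // "%PDF"
  { type: "png",       bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: "jpeg",      bytes: [0xff, 0xd8, 0xff] },
  { type: "gif",       bytes: [0x47, 0x49, 0x46, 0x38] },                         // "GIF8"
  { type: "rar",       bytes: [0x52, 0x61, 0x72, 0x21, 0x1a, 0x07] },             // "Rar!"
  { type: "zip-based", bytes: [0x50, 0x4b, 0x03, 0x04] },                         // zip, docx, xlsx, pptx, odt
  { type: "ole2",      bytes: [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1] }  // legacy doc, xls, ppt
];

// Hypothetical helper: returns the first matching type name, or null.
function detectSignature(bytes) {
  for (var i = 0; i < SIGNATURES.length; i++) {
    var sig = SIGNATURES[i].bytes;
    if (bytes.length < sig.length) continue; // Too short to match this entry.
    var match = true;
    for (var j = 0; j < sig.length; j++) {
      if (bytes[j] !== sig[j]) { match = false; break; }
    }
    if (match) return SIGNATURES[i].type;
  }
  return null;
}
```

Inside fileLoaded you would call detectSignature(bytes) and treat null as "unknown, let the server decide" rather than as a definite rejection.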

False positives don't matter that much in this case, though, because this is only useful as a user-friendly measure to check that the user isn't uploading files that the server will reject anyway. The server should always validate what it ends up receiving.

Of course, reading the entire file to look at the first few bytes isn't all that efficient either. :-) See Ray Nicholus' comment about that.
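
One way to avoid reading the whole file is to slice off the header before handing it to the FileReader. This is a sketch, not tested across the browsers listed above: it assumes an unprefixed Blob.slice is available (older versions shipped prefixed variants), and headerBlob and readHeader are hypothetical helper names.

```javascript
// Hypothetical helper: returns a Blob covering at most the first `length`
// bytes of the file. slice() clamps to the file size, so short files are fine.
function headerBlob(file, length) {
  return file.slice(0, length);
}

// Hypothetical helper: reads only the header and passes the bytes to `callback`.
function readHeader(file, length, callback) {
  var reader = new FileReader();
  reader.onload = function (e) {
    callback(new Uint8Array(e.target.result));
  };
  reader.readAsArrayBuffer(headerBlob(file, length));
}
```

In the change handler you could then call readHeader(file, 32, function (bytes) { ... }) instead of reading the full file.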

OTHER TIPS

Unless you can actually parse the content and tell from the result whether the file is of a certain type, I don't see a good way of doing that with pure JS. You might want to consider uploading the file to the server temporarily and performing the check there. The Unix file command is a very useful tool for that. It does not rely on file extensions, but analyzes the file content to determine the type.
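
If the server happens to run Node.js (an assumption; the question doesn't say what the backend is), calling file could look roughly like the sketch below. The helper names and the allow-list are illustrative only, and the exact MIME strings reported for Office and rar files depend on the installed version of file, so check its output on your own system before relying on them.

```javascript
var execFileSync = require("child_process").execFileSync;

// Illustrative allow-list. Extend it with whatever your version of `file`
// reports for the remaining formats (Office documents, rar, odt).
var ALLOWED_MIME_TYPES = [
  "application/pdf",
  "application/zip",
  "image/gif",
  "image/jpeg",
  "image/png",
  "text/plain"
];

// Hypothetical helper: asks `file` for the MIME type of an uploaded temp file.
// -b omits the file name from the output; --mime-type prints only the type.
function detectMimeType(path) {
  return execFileSync("file", ["-b", "--mime-type", path], { encoding: "utf8" }).trim();
}

// Hypothetical helper: true if the detected type is on the allow-list.
function isAllowedMimeType(mimeType) {
  return ALLOWED_MIME_TYPES.indexOf(mimeType) !== -1;
}
```

A request handler would call isAllowedMimeType(detectMimeType(tempPath)) and delete the temporary file when the result is false.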

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow