
Mime type missing for .rar and .tar

JavaScript (Firefox on Windows 8.1) doesn't seem to report MIME types for .tar or .rar files (and perhaps others; these are the only two I've found). What's up with that? Is there anything I can use to resolve this? I'd really like to be able to retrieve MIME types for these file types without doing some weird extension hacking.

I made a fiddle to prove the issue: http://jsfiddle.net/kungfujoe/jd8h7wvs/

If you browse to a .txt, a .docx, or many other formats, the type is retrieved successfully. However, neither .tar nor .rar files yield a type. Odd, right?

(JSFiddle code below)

HTML

<input id='button' type='file' name='file'/>
<div id='out'>Output Goes Here</div>

JavaScript (using jQuery 2.1.0)

// Remove any earlier handler first so re-running the fiddle does not stack listeners.
$('#button').off('change').on('change', function () {
    var file = this.files[0];
    if (!file) {
        throw new Error('No file selected');
    }
    // file.type is an empty string when the browser cannot determine a MIME type.
    document.getElementById('out').textContent = 'Type is ' + file.type;
});

Thanks

EDIT

1) Updated the question to reflect that the issue has been observed in Firefox on Windows 8.1. Chrome reports a MIME type for tar files, but not for rar files.

2) Added jQuery to the Fiddle

Cody S asked Oct 01 '14 19:10


1 Answer

jQuery just exposes the underlying File API that most browsers implement, so there is no difference in how jQuery and plain JavaScript handle files and MIME types. Here is the File API spec:

http://www.w3.org/TR/FileAPI/#dfn-type

The File object you are working with inherits its type property from Blob, and the browser uses the blob's contents (a byte array) to determine the MIME type.

To do that, each browser implements a file-sniffing algorithm that "reads" the MIME type from the byte array. If the bytes match none of the patterns it knows, it returns an empty string, as in your scenario above.

Here is the full algorithm spec:

https://mimesniff.spec.whatwg.org/

So why doesn't it work for TAR, ZIP and RAR files, and why does it work for some people but not for you? Because the file-sniffing algorithm is evidently not perfect.

It relies on byte pattern matching, which does not seem to be reliable enough.

For example, I used WinRAR on my Windows 8 machine to compress a file, and the first bytes of the resulting file are:

52 61 72 21 1A 07 00

However, to recognize the file as a RAR archive, the browser's byte pattern matching algorithm expects:

52 61 72 20 1A 07 00

As you can see, the two sequences differ in the fourth byte (21 versus 20). When I selected my RAR file in the browser using your code above, Firefox wasn't able to recognize the MIME type, and I got an empty string in the type property.

However, when I packed a ZIP file with WinRAR on the same machine using default settings, the file began with the byte sequence 50 4B 03 04. That matches the ZIP byte pattern the algorithm expects, and your code above correctly detected the MIME type as application/zip.

So, as the examples show, it comes down to how the file is serialized and to the "imperfection" of the algorithm that browsers use to match the serialized bytes to MIME types.
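
To make the byte comparison above concrete, here is a minimal sketch of a signature check in plain JavaScript. The RAR and ZIP patterns are the ones quoted above. The tar entry (the "ustar" marker at offset 257, which older tar variants may lack) and the MIME strings are additions for illustration, and `matchesAt` and `detectArchiveType` are hypothetical helper names, not part of any browser API.

```javascript
// Leading byte patterns ("magic numbers") for a few archive formats.
// RAR and ZIP bytes come from the discussion above; the tar entry and the
// MIME strings are assumptions added for this sketch.
const SIGNATURES = [
  { mime: "application/x-rar-compressed", offset: 0,
    bytes: [0x52, 0x61, 0x72, 0x21, 0x1a, 0x07, 0x00] }, // "Rar!" 1A 07 00
  { mime: "application/zip", offset: 0,
    bytes: [0x50, 0x4b, 0x03, 0x04] },                   // "PK" 03 04
  { mime: "application/x-tar", offset: 257,
    bytes: [0x75, 0x73, 0x74, 0x61, 0x72] },             // "ustar"
];

// Returns true when `data` contains `pattern` starting at `offset`.
function matchesAt(data, pattern, offset) {
  if (data.length < offset + pattern.length) {
    return false;
  }
  return pattern.every((byte, i) => data[offset + i] === byte);
}

// Hypothetical helper: returns a MIME type string, or "" when nothing matches
// (mirroring how file.type behaves when the browser has no answer).
function detectArchiveType(data) {
  const hit = SIGNATURES.find((sig) => matchesAt(data, sig.bytes, sig.offset));
  return hit ? hit.mime : "";
}
```

In a browser you could feed it the first bytes of the selected file, for example by reading `file.slice(0, 262)` with `FileReader.readAsArrayBuffer` and wrapping the result in a `Uint8Array`.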

Based on all of the above, I would recommend NOT relying on MIME sniffing. Instead, determine the MIME type with your own code or with an existing library. You can do this on the server or on the client.

If you want to stick to the client you could use the following JS library:

https://github.com/rsdoiel/mimetype-js

And then discovering the mime type would be a matter of one line of code:

mimetype.lookup("myfile.rar")

Here is a working fiddle that updates your example to use mimetype-js:

http://jsfiddle.net/jd8h7wvs/4/
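
If a library is more than you need, the same idea can be sketched as a small fallback: trust `file.type` when the browser fills it in, and otherwise look the extension up in a table. The table contents and the names `EXTENSION_TYPES`, `typeFromName` and `resolveType` are illustrative assumptions, not part of mimetype-js.

```javascript
// Small illustrative table; extend as needed. The MIME strings are assumptions.
const EXTENSION_TYPES = {
  rar: "application/x-rar-compressed",
  tar: "application/x-tar",
  zip: "application/zip",
};

// Hypothetical helper: maps a file name to a MIME type by its extension.
// Returns "" for names without an extension, dotfiles, and unknown extensions.
function typeFromName(fileName) {
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) {
    return "";
  }
  const ext = fileName.slice(dot + 1).toLowerCase();
  return Object.prototype.hasOwnProperty.call(EXTENSION_TYPES, ext)
    ? EXTENSION_TYPES[ext]
    : "";
}

// Prefer what the browser reports; fall back to the extension table.
// `file` only needs `name` and `type` properties, like a File object.
function resolveType(file) {
  return file.type || typeFromName(file.name);
}
```

Inside the change handler from the question, this would be called as `resolveType(this.files[0])`.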

Faris Zacina answered Sep 28 '22 04:09