
JSON file generated by PHP has application/octet-stream mime type

I have a script that generates a JSON file from data. A second script reads the files in a directory, keeps only the JSON ones, and inserts them into the DB.

The problem is that the second script detects the "application/octet-stream" MIME type for my generated files instead of application/json.

I don't want to allow the application/octet-stream MIME type, as it can be pretty much anything. This matters for security reasons: the second script loads every JSON file in the directory, not only the generated ones.

Is there any way to "set" a MIME type for a file?

The code that generates the file:

if($r_handle = fopen($s_file_name, 'w+')){
    fwrite($r_handle, json_encode($o_datas, JSON_HEX_QUOT | JSON_HEX_TAG));
    fclose($r_handle);
    return;
}

The code that reads JSON files:

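$o_finfo = finfo_open(FILEINFO_MIME_TYPE);
$a_mimes =& get_mimes(); // MIME map; $a_mimes['json'] holds the accepted JSON types
if(is_dir($s_dir) && $r_handle = opendir($s_dir)){
    // Compare against false explicitly so a file named "0" does not end the loop
    while(false !== ($s_file = readdir($r_handle))){
        $s_file_path = $s_dir.$s_file; // $s_dir must end with a directory separator
        $s_mime      = finfo_file($o_finfo, $s_file_path);
        if(!in_array($s_file, array('.', '..')) && in_array($s_mime, $a_mimes['json'])){
            // Some code
        }
    }
    closedir($r_handle);
}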
VeZoul asked Nov 08 '22 03:11


1 Answer

The fileinfo extension (like similar tools such as the Unix file command) basically searches for signatures defined in a database (called "magic"). If I'm not wrong, PHP's magic database is currently compiled into the extension binary, so you can't peek at it, but you'll probably have a similar database on your system. I have Apache's at C:\Apache24\conf.magic and this is the entry for JPEG:

# JPEG images
0   beshort     0xffd8      image/jpeg

Anything that starts with 0xffd8 is a picture. Done!

[Image: JPEG file in hex editor]

I'm not particularly familiar with the format, but it doesn't even seem to look for JSON. And, as you may already be guessing, the overall utility is by no means a security feature. It's just a helper tool to figure out what a file may contain. It's very handy if, for example, you've recovered files with no extension from a damaged disk.
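
To make the signature lookup concrete, here is a minimal sketch of what the single JPEG rule above amounts to. The function name looks_like_jpeg is hypothetical and this is not how fileinfo is actually implemented; it only mirrors the "0 beshort 0xffd8" entry: read a big-endian 16-bit value at offset 0 and compare it.

```php
<?php
// Hypothetical helper mimicking the magic rule "0 beshort 0xffd8 image/jpeg".
function looks_like_jpeg(string $s_bytes): bool
{
    if (strlen($s_bytes) < 2) {
        return false; // too short to carry the two-byte signature
    }
    // 'n' unpacks an unsigned 16-bit big-endian integer ("beshort" at offset 0).
    $a_parts = unpack('n', substr($s_bytes, 0, 2));
    return $a_parts[1] === 0xFFD8;
}

var_dump(looks_like_jpeg("\xFF\xD8\xFF\xE0")); // bytes starting with the JPEG signature
var_dump(looks_like_jpeg('{"key": "value"}')); // JSON text has no fixed signature
```

Note that the check accepts anything whose first two bytes match, whatever follows, which is exactly why it can't be trusted as a security measure.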


MIME types are cool. You set application/json and everybody knows it's JSON. Straightforward and simple, isn't it? There are only two caveats:

  • File systems (many of them actually invented before MIME types) store many file attributes (name, last modification date, permissions, sometimes even icons...) but not MIME types. (Sure, there's probably some academic file system that does, but it's not the case of FAT32, NTFS, ext4...). A stored MIME type doesn't normally add valuable information, it's yet another token to keep updated, and it's particularly non-portable (copy your files to a thumb drive and the MIME types are gone).

  • It's still not a security feature. If I can forge the file contents, what prevents me from forging the MIME type?


So, what can you do? The best alternative is: nothing at all.

Just parse the file as JSON and check whether parsing failed. You need to do it anyway and it tells you everything you need to know. JSON is just plain text data. Maybe add a check to reject very large files (again, you should be doing that anyway in your file upload) and pass a $depth limit, but that's all.

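// json_decode() expects the JSON text itself, not a file path
$s_json = file_get_contents($s_file_path);
// A null result is still valid when the document is the literal "null", so also check the error code
if ($s_json !== false && (json_decode($s_json, true, 32) !== null || json_last_error() === JSON_ERROR_NONE)) {
    // Valid JSON
}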
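
Putting the size check, the depth limit and the parse together, a minimal sketch could look like the following. The function name is_valid_json_file, the 1 MiB size limit and the depth of 32 are assumptions chosen for illustration; tune them to your own data.

```php
<?php
// Hypothetical helper; the default limits are illustrative, not recommendations.
function is_valid_json_file(string $s_file_path, int $i_max_bytes = 1048576, int $i_depth = 32): bool
{
    clearstatcache(true, $s_file_path); // avoid a stale cached size
    if (!is_file($s_file_path) || !is_readable($s_file_path)) {
        return false;
    }
    $i_size = filesize($s_file_path);
    if ($i_size === false || $i_size === 0 || $i_size > $i_max_bytes) {
        return false; // unreadable, empty or suspiciously large
    }
    $s_json = file_get_contents($s_file_path);
    if ($s_json === false) {
        return false;
    }
    json_decode($s_json, true, $i_depth);
    // The error code separates a failed parse from the valid literal "null".
    return json_last_error() === JSON_ERROR_NONE;
}
```

In the directory loop from the question, a call such as is_valid_json_file($s_file_path) could stand in for the finfo_file() comparison, so the decision rests on whether the content parses rather than on a guessed MIME type.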
Álvaro González answered Nov 15 '22 05:11