
PHP 5.3.5 fileinfo() MIME Type for MS Office 2007 files - magic.mime updates?

On a PHP upload, I'm trying to validate the MIME type of each uploaded file against a set of MIME types the application accepts. When I use the Fileinfo extension (finfo) to determine the MIME type of an Office 2007 file, it does NOT report the appropriate Office MIME type. Instead, the MIME type returned is "application/zip".

Office Document MIME types: http://filext.com/faq/office_mime_types.php

Example PHP Code:

// Detect the MIME type from the uploaded file's contents
$oFileInfo = new finfo( FILEINFO_MIME_TYPE );
$sMimeType = $oFileInfo->file( $_FILES['Filedata']['tmp_name'] );
echo $sMimeType;

Server Setup Info:

  • OS: Windows Server 2003 32-bit
  • Webserver: IIS 6.0
  • PHP: 5.3.5 (Thread Safe) using FastCGI 1.5
  • File: magic.mime
    • Example by darko at uvcms dot com 16-Apr-2008 09:35
      • Link: php.net/manual/en/fileinfo.installation.php
    • Size: 517 KB
    • Source: SourceForge, GnuWin32 FileType package: gnuwin32.sourceforge.net/packages/filetype.htm

I've found numerous posts that refer to issues with the newer Office formats when downloading from a webserver. In none of them have I found anything that illustrates a manner of adding the new MIME types to an existing magic.mime file, or a link to a magic.mime file that already contains the Microsoft Office 2007+ MIME types. Thanks for your assistance.

Arachnid asked Nov 06 '22 03:11


2 Answers

Newer Office files are actually ZIP archives. That's why the MIME magic database detects them as ZIP files. You may need to add special rules based on the file extension, or look into the ZIP file to see whether it has a docProps folder (Office ZIP archives have such a folder containing metadata about the document).

There are other file formats which are actually ZIP archives with a different extension, e.g. JAR files.
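
The extension-based rule could be sketched roughly like this. The helper name refineZipMimeByExtension() is made up for illustration, and the sketch assumes you still have the original client filename (e.g. $_FILES['Filedata']['name']). Since that name is supplied by the client and can be spoofed, the rule is only applied when the content check has already reported a ZIP archive.

```php
<?php
// Sketch only: refineZipMimeByExtension() is a hypothetical helper,
// not part of the Fileinfo extension.

function refineZipMimeByExtension($detectedMime, $clientFilename)
{
    // Only second-guess finfo when it reported a generic ZIP archive.
    if ($detectedMime !== 'application/zip') {
        return $detectedMime;
    }

    // Registered MIME types for the three common Office 2007+ formats.
    $officeTypes = array(
        'docx' => 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
        'xlsx' => 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
        'pptx' => 'application/vnd.openxmlformats-officedocument.presentationml.presentation',
    );

    // The extension comes from the client, so treat it as a hint only.
    $ext = strtolower(pathinfo($clientFilename, PATHINFO_EXTENSION));

    return isset($officeTypes[$ext]) ? $officeTypes[$ext] : $detectedMime;
}
```

You would pass it the value returned by finfo together with the uploaded file's original name, then compare the result against your list of allowed MIME types as before.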

jmz answered Nov 12 '22 18:11


  1. Yes, you should update magic.mime.

lol, yeah, just update it, problem solved. Unfortunately, it looks like the magic MIME type system works by looking at the actual file contents, and because the file is compressed, it can't see inside the archive (and which of the contained files would it look at anyway?).

Someone suggested writing a function to unzip the compressed file and then check for the existence of a "DocProps" directory, for instance. But this would introduce another attack vector on the production server.
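
One way to reduce that exposure might be to read only the archive's list of entry names and never extract anything. This is a rough sketch, assuming the zip extension (ZipArchive) is installed. Both function names and the 1000-entry cap are made up for illustration. The check for "[Content_Types].xml" is an extra assumption on my part (as far as I know, Office 2007+ packages carry that entry at the archive root), and the docProps folder is only a heuristic, so treat the result as a hint rather than proof.

```php
<?php
// Sketch only: both function names are hypothetical.

// Heuristic: an Office 2007+ package has a "[Content_Types].xml" entry
// at the archive root and usually a "docProps/" folder.
function looksLikeOfficePackage(array $entryNames)
{
    $hasContentTypes = in_array('[Content_Types].xml', $entryNames, true);

    $hasDocProps = false;
    foreach ($entryNames as $name) {
        // Case-insensitive prefix match on the folder name.
        if (stripos($name, 'docProps/') === 0) {
            $hasDocProps = true;
            break;
        }
    }

    return $hasContentTypes && $hasDocProps;
}

// Collect entry names from the archive's directory without extracting
// any file. Returns null if the file cannot be opened as a ZIP or if it
// has more entries than $maxEntries (an arbitrary safety cap).
function listZipEntryNames($path, $maxEntries = 1000)
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return null;
    }
    if ($zip->numFiles > $maxEntries) {
        $zip->close();
        return null;
    }

    $names = array();
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        if ($name !== false) {
            $names[] = $name;
        }
    }
    $zip->close();

    return $names;
}
```

In the upload handler you would call listZipEntryNames() on $_FILES['Filedata']['tmp_name'] only after finfo has reported "application/zip", and reject the file if the result is null or looksLikeOfficePackage() returns false.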

targnation answered Nov 12 '22 17:11