
php 7.2 finfo magic file

I have a Laravel 5 project that lets a user download .ai (Adobe Illustrator) files. The issue is that Laravel detects .ai files as application/pdf.

I am detecting the MIME type with this function:

$type = File::mimeType( $_path );

I also tried this approach, but got the same result:

$finfo = finfo_open(FILEINFO_MIME);
// finfo_open() returns a handle for the procedural API, so use finfo_file()
$mimetype = finfo_file($finfo, $_path);
finfo_close($finfo);

I figured this has to be an issue with PHP simply not knowing what an .ai file is. I dug deeper into finfo and I understand that the default MIME definitions are compiled into PHP, but I see that finfo_open has a second argument, 'magic_file', which I assume is where you can pass the path to a different MIME definition file.

I tried using Ubuntu's /etc/magic.mime file, but finfo gave me this error:

ErrorException: finfo_open(): Warning: offset `application\/activemessage' invalid in

I assume this is because the magic.mime file is not in the correct format.

Most of the topics online create a custom PHP function or some other hack to detect MIME types, but I feel that is not the correct solution here.

Where can I find up-to-date MIME definition files, and how can I load them into PHP or finfo?

My environment:

Ubuntu 16.04
PHP 7.2
Karl Johan Vallner asked Mar 06 '23 04:03



1 Answer

I figured this has to be an issue with PHP simply not knowing what an .ai file is. I dug deeper into finfo and I understand that the default MIME definitions are compiled into PHP, but I see that finfo_open has a second argument, 'magic_file', which I assume is where you can pass the path to a different MIME definition file.

The fileinfo extension tries to guess the MIME type by looking for certain magic byte sequences at specific positions within the file. The magic file is a database of the magic sequences that are already known.
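To make the idea concrete, here is a minimal sketch of a single magic-sequence check done by hand. The helper name hasMagicAt is hypothetical and not part of fileinfo; the real extension evaluates many such rules from its magic database, with far more rule types than a plain byte comparison.

```php
<?php
// Hypothetical helper (not part of fileinfo): returns true when the bytes
// found at $offset in the file are exactly the given magic sequence.
function hasMagicAt(string $path, string $magic, int $offset = 0): bool
{
    if ($magic === '' || !is_readable($path)) {
        return false;
    }

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }

    // Jump to the position the rule refers to and read just enough bytes.
    fseek($handle, $offset);
    $bytes = fread($handle, strlen($magic));
    fclose($handle);

    return $bytes === $magic;
}
```

For example, hasMagicAt($path, '%PDF-') checks for the PDF signature at offset 0, which is the same starting test a PDF rule in a magic file performs.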

I have a Laravel 5 project that lets a user download .ai (Adobe Illustrator) files. The issue is that Laravel detects .ai files as application/pdf.

Following from the explanation above, it is expected that a magic-file lookup detects .ai files as PDF, because an Adobe Illustrator Artwork file can be saved in either EPS or PDF format.

I did some research on how to distinguish a general PDF file from an AI file saved in PDF format. First, I downloaded some free .ai files from the internet and identified them by looking at the magic number with both the hexdump and file commands.

$ hexdump -C 7_full_ai_vi_template_vector_8.ai | head
00000000  25 50 44 46 2d 31 2e 34  0d 25 e2 e3 cf d3 0d 0a  |%PDF-1.4.%......|
00000010  31 20 30 20 6f 62 6a 0d  3c 3c 20 0d 2f 54 79 70  |1 0 obj.<< ./Typ|
00000020  65 20 2f 43 61 74 61 6c  6f 67 20 0d 2f 50 61 67  |e /Catalog ./Pag|
00000030  65 73 20 32 20 30 20 52  20 0d 2f 4d 65 74 61 64  |es 2 0 R ./Metad|
00000040  61 74 61 20 38 38 20 30  20 52 20 0d 3e 3e 20 0d  |ata 88 0 R .>> .|
00000050  65 6e 64 6f 62 6a 0d 32  20 30 20 6f 62 6a 0d 3c  |endobj.2 0 obj.<|
00000060  3c 20 0d 2f 54 79 70 65  20 2f 50 61 67 65 73 20  |< ./Type /Pages |
00000070  0d 2f 4b 69 64 73 20 5b  20 35 20 30 20 52 20 5d  |./Kids [ 5 0 R ]|
00000080  20 0d 2f 43 6f 75 6e 74  20 31 20 0d 3e 3e 20 0d  | ./Count 1 .>> .|
00000090  65 6e 64 6f 62 6a 0d 33  20 30 20 6f 62 6a 0d 3c  |endobj.3 0 obj.<|
$ file 7_full_ai_vi_template_vector_8.ai
7_full_ai_vi_template_vector_8.ai: PDF document, version 1.4

Look at the first few bytes of the file. As you can see, it is a PDF file with an .ai extension.

After that, I opened it with Preview on my Mac. Its Inspector dialog shows that the file was created by Adobe Illustrator, so there must be some way to recognize an AI file saved in PDF format.

[Image: Inspector Dialog]

Most of the topics online create a custom PHP function or some other hack to detect mime types, but I feel as if that is not the correct solution here.

Where can I find up-to-date mime definition files and how can I load them into PHP or finfo?

I searched for an existing solution with no luck, so I created one myself. Page 15 of the Adobe Illustrator File Format Specification says:

The %%Creator comment identifies the application that generated the PostScript language document. The version number (version 6.0 in Figure 1) is arbitrary text, terminated by a newline character.

I assume that a file containing the PDF magic bytes and the string "%%Creator: Adobe Illustrator" near the beginning of the file should be identified as .ai.

Let's do that by writing a magic rule:

$ cat ai
0       string          %PDF-           PDF document
!:mime  application/pdf
>5      byte            x               \b, version %c
>7      byte            x               \b.%c
>7      search/1000     %%Creator:\ Adobe\ Illustrator  Adobe Illustrator Document

A PHP script that uses the custom magic file for .ai:

$ cat fileinfo.php
<?php

$magic_file = __DIR__ . '/ai';

$finfo = new finfo(FILEINFO_NONE, $magic_file);
echo $finfo->file($argv[1]) . PHP_EOL;

This will output:

$ php fileinfo.php ./7_full_ai_vi_template_vector_8.ai
PDF document, version 1.4 Adobe Illustrator Document

It works, but I don't think it is a good idea to maintain your own magic file. Instead, you could write a simple function that checks for .ai once $type = File::mimeType( $_path ); reports that the file is a PDF.
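A minimal sketch of such a function, under a few assumptions: the names looksLikeIllustratorPdf and detectMimeType are hypothetical, 'application/illustrator' is only an assumed label (substitute whatever your application expects), and reading the first 1024 bytes only approximates the search/1000 rule from the magic file above. It uses plain finfo rather than Laravel's File facade so that it runs on its own.

```php
<?php
// Assumption: the MIME label to report for Illustrator files; adjust as needed.
const AI_MIME_TYPE = 'application/illustrator';

// Hypothetical helper: true when the file starts with the PDF signature and
// the Illustrator %%Creator comment appears within the first $searchBytes bytes.
function looksLikeIllustratorPdf(string $path, int $searchBytes = 1024): bool
{
    if (!is_readable($path)) {
        return false;
    }

    // Read only the head of the file; the comment is expected near the start.
    $head = file_get_contents($path, false, null, 0, $searchBytes);
    if ($head === false) {
        return false;
    }

    return strncmp($head, '%PDF-', 5) === 0
        && strpos($head, '%%Creator: Adobe Illustrator') !== false;
}

// Hypothetical wrapper: ask fileinfo first, then refine a PDF result.
function detectMimeType(string $path): string
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $type = $finfo->file($path);

    if ($type === false) {
        // Generic fallback when fileinfo cannot read the file.
        return 'application/octet-stream';
    }

    if ($type === 'application/pdf' && looksLikeIllustratorPdf($path)) {
        return AI_MIME_TYPE;
    }

    return $type;
}
```

In the Laravel project, the same refinement could be applied to the result of File::mimeType( $_path ) instead of calling finfo directly. Files where the %%Creator comment sits further into the file would still be reported as application/pdf, so the search window may need tuning.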

Gasol answered Mar 14 '23 21:03
