Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to find the mime type of a file in python?

Tags:

python

mime

People also ask

How do I find the file MIME type?

For detecting MIME-types, use the aptly named "mimetype" command. It has a number of options for formatting the output, it even has an option for backward compatibility to "file". But most of all, it accepts input not only as file, but also via stdin/pipe, so you can avoid temporary files when processing streams.

What is MIME type of python file?

Multipurpose Internet Mail Extensions (MIME) type is a string that is used to identify a type of data that a particular file contains. It's generally used to represents the type of file on the Internet (usually in mails) so that software handling the data can understand how to handle it.

Where is MIME type stored in a file?

All MIME type information is stored in a database. The MIME database is located in the directory /usr/share/mime/ . The MIME database contains a large number of common MIME types, stored in the file /usr/share/mime/packages/freedesktop.

How do I know the file type in Python?

The general way of recognizing the type of file is by looking at its extension. But this isn't generally the case. This type of standard for recognizing file by associating an extension with a file type is enforced by some operating system families (predominantly Windows).


The python-magic method suggested by toivotuo is outdated. Python-magic's current trunk is at Github and based on the readme there, finding the MIME-type, is done like this.

# For MIME types
import magic
mime = magic.Magic(mime=True)
mime.from_file("testdata/test.pdf") # 'application/pdf'

The mimetypes module in the standard library will determine/guess the MIME type from a file extension.

If users are uploading files the HTTP post will contain the MIME type of the file alongside the data. For example, Django makes this data available as an attribute of the UploadedFile object.


More reliable way than to use the mimetypes library would be to use the python-magic package.

import magic
m = magic.open(magic.MAGIC_MIME)
m.load()
m.file("/tmp/document.pdf")

This would be equivalent to using file(1).

On Django one could also make sure that the MIME type matches that of UploadedFile.content_type.


This seems to be very easy

>>> from mimetypes import MimeTypes
>>> import urllib 
>>> mime = MimeTypes()
>>> url = urllib.pathname2url('Upload.xml')
>>> mime_type = mime.guess_type(url)
>>> print mime_type
('application/xml', None)

Please refer Old Post

Update - In python 3+ version, it's more convenient now:

import mimetypes
print(mimetypes.guess_type("sample.html"))

2017 Update

No need to go to github, it is on PyPi under a different name:

pip3 install --user python-magic
# or:
sudo apt install python3-magic  # Ubuntu distro package

The code can be simplified as well:

>>> import magic

>>> magic.from_file('/tmp/img_3304.jpg', mime=True)
'image/jpeg'

Python bindings to libmagic

All the different answers on this topic are very confusing, so I’m hoping to give a bit more clarity with this overview of the different bindings of libmagic. Previously mammadori gave a short answer listing the available option.

libmagic

  • module name: magic
  • pypi: file-magic
  • source: https://github.com/file/file/tree/master/python

When determining a files mime-type, the tool of choice is simply called file and its back-end is called libmagic. (See the Project home page.) The project is developed in a private cvs-repository, but there is a read-only git mirror on github.

Now this tool, which you will need if you want to use any of the libmagic bindings with python, already comes with its own python bindings called file-magic. There is not much dedicated documentation for them, but you can always have a look at the man page of the c-library: man libmagic. The basic usage is described in the readme file:

import magic

detected = magic.detect_from_filename('magic.py')
print 'Detected MIME type: {}'.format(detected.mime_type)
print 'Detected encoding: {}'.format(detected.encoding)
print 'Detected file type name: {}'.format(detected.name)

Apart from this, you can also use the library by creating a Magic object using magic.open(flags) as shown in the example file.

Both toivotuo and ewr2san use these file-magic bindings included in the file tool. They mistakenly assume, they are using the python-magic package. This seems to indicate, that if both file and python-magic are installed, the python module magic refers to the former one.

python-magic

  • module name: magic
  • pypi: python-magic
  • source: https://github.com/ahupp/python-magic

This is the library that Simon Zimmermann talks about in his answer and which is also employed by Claude COULOMBE as well as Gringo Suave.

filemagic

  • module name: magic
  • pypi: filemagic
  • source: https://github.com/aliles/filemagic

Note: This project was last updated in 2013!

Due to being based on the same c-api, this library has some similarity with file-magic included in libmagic. It is only mentioned by mammadori and no other answer employs it.


There are 3 different libraries that wraps libmagic.

2 of them are available on pypi (so pip install will work):

  • filemagic
  • python-magic

And another, similar to python-magic is available directly in the latest libmagic sources, and it is the one you probably have in your linux distribution.

In Debian the package python-magic is about this one and it is used as toivotuo said and it is not obsoleted as Simon Zimmermann said (IMHO).

It seems to me another take (by the original author of libmagic).

Too bad is not available directly on pypi.