How to get the MIME Type of a .MSG file?

I have tried these ways of finding the MIME type of a file...

    Path source = Paths
            .get("C://Users/akash/Desktop/FW Internal release of MSTClient-Server5.02.04_24.msg");
    System.out.println(Files.probeContentType(source));

The code above returns null.
If I use the Apache Tika API to get the MIME type, it reports text/plain.

But I want the result as application/vnd.ms-outlook

UPDATE

I also tried MIME-Util.jar, with the following code:

    MimeUtil2 mimeUtil = new MimeUtil2();
    mimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.MagicMimeMimeDetector");
    RandomAccessFile file1 = new RandomAccessFile(
            "C://Users/akash/Desktop/FW Internal release of MSTClient-Server5.02.04_24.msg",
            "r");
    System.out.println(file1.length());
    byte[] file = new byte[624128];
    file1.read(file, 0, 624128);
    String mimeType = MimeUtil2.getMostSpecificMimeType(mimeUtil.getMimeTypes(file)).toString();

This gives me application/msword as output.

UPDATE:

The Tika API is out of scope, as it is too large to include in the project.

So how can I find the MIME type?

CoderNeji asked Jun 26 '15 10:06




1 Answer

I tried several of the possible approaches, and Tika gives the result you expected. I can't see the Tika code you used, so I cannot double-check it.

I tried the following approaches (not all of them appear in the code snippet):

  1. Java 7 Files.probeContentType(path)
  2. URLConnection mime detection from file name and content type guessing
  3. JDK 6 JAF API javax.activation.MimetypesFileTypeMap
  4. MimeUtil with all available subclasses of MimeDetector I could find
  5. Apache Tika
  6. Apache POI scratchpad

Here is the test class:

    import java.io.BufferedInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.InputStream;
    import java.net.URLConnection;
    import java.util.Collection;

    import javax.activation.MimetypesFileTypeMap;

    import org.apache.tika.detect.Detector;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.mime.MediaType;
    import org.apache.tika.parser.AutoDetectParser;

    import eu.medsea.mimeutil.MimeUtil;

    public class FindMime {

        public static void main(String[] args) {
            File file = new File("C:\\Users\\qwerty\\Desktop\\test.msg");

            System.out.println("urlConnectionGuess " + urlConnectionGuess(file));

            System.out.println("fileContentGuess " + fileContentGuess(file));

            MimetypesFileTypeMap mimeTypesMap = new MimetypesFileTypeMap();

            System.out.println("mimeTypesMap.getContentType " + mimeTypesMap.getContentType(file));

            System.out.println("mimeutils " + mimeutils(file));

            System.out.println("tika " + tika(file));
        }

        private static String mimeutils(File file) {
            try {
                MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.MagicMimeMimeDetector");
                MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.ExtensionMimeDetector");
    //          MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.OpendesktopMimeDetector");
                MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.WindowsRegistryMimeDetector");
    //          MimeUtil.registerMimeDetector("eu.medsea.mimeutil.detector.TextMimeDetector");
                InputStream is = new BufferedInputStream(new FileInputStream(file));
                Collection<?> mimeTypes = MimeUtil.getMimeTypes(is);
                return mimeTypes.toString();
            } catch (Exception e) {
                // TODO: handle exception
            }
            return null;
        }

        private static String tika(File file) {
            try {
                InputStream is = new BufferedInputStream(new FileInputStream(file));
                AutoDetectParser parser = new AutoDetectParser();
                Detector detector = parser.getDetector();
                Metadata md = new Metadata();
                md.add(Metadata.RESOURCE_NAME_KEY, "test.msg");
                MediaType mediaType = detector.detect(is, md);
                return mediaType.toString();
            } catch (Exception e) {
                // TODO: handle exception
            }
            return null;
        }

        private static String urlConnectionGuess(File file) {
            String mimeType = URLConnection.guessContentTypeFromName(file.getName());
            return mimeType;
        }

        private static String fileContentGuess(File file) {
            try {
                InputStream is = new BufferedInputStream(new FileInputStream(file));
                return URLConnection.guessContentTypeFromStream(is);
            } catch (Exception e) {
                e.printStackTrace();
                return null;
            }
        }

    }

and this is the output:

    urlConnectionGuess null
    fileContentGuess null
    mimeTypesMap.getContentType application/octet-stream
    mimeutils application/msword,application/x-hwp
    tika application/vnd.ms-outlook
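
The null from the name-based lookup in that output suggests the JDK's built-in table simply has no entry for the .msg extension. If trusting the extension is acceptable for your use case, one option is a small fallback table of your own. The following is only a sketch: the class name ExtensionFallback and its methods are hypothetical, and an extension-based answer is only as reliable as the file name (a renamed text file would still be reported as an Outlook message).

```java
import java.net.URLConnection;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of any library mentioned in this thread.
class ExtensionFallback {

    // Extra extension -> MIME type entries that the JDK lookup did not resolve.
    private static final Map<String, String> EXTRA_TYPES = new HashMap<>();
    static {
        EXTRA_TYPES.put("msg", "application/vnd.ms-outlook");
    }

    /** Returns the type registered for the file name's extension, or null. */
    static String fromExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension at all, or a trailing dot
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTRA_TYPES.get(ext);
    }

    /** Tries the JDK name-based lookup first, then the fallback table. */
    static String guess(String fileName) {
        String type = URLConnection.guessContentTypeFromName(fileName);
        return type != null ? type : fromExtension(fileName);
    }

    public static void main(String[] args) {
        System.out.println(guess("test.msg"));
    }
}
```

This never opens the file, so it is cheap, but it cannot tell a genuine message from a file that merely carries the .msg extension.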

Update: I added this method to test other ways of detecting the type with Tika:

    // Additionally requires these imports in FindMime:
    //   org.apache.tika.Tika, org.apache.tika.detect.TypeDetector, org.apache.tika.mime.MimeTypes
    private static void tikaMore(File file) {
        Tika defaultTika = new Tika();
        Tika mimeTika = new Tika(new MimeTypes());
        Tika typeTika = new Tika(new TypeDetector());
        try {
            System.out.println(defaultTika.detect(file));
            System.out.println(mimeTika.detect(file));
            System.out.println(typeTika.detect(file));
        } catch (Exception e) {
            // TODO: handle exception
        }
    }

Tested with a .msg file without an extension:

    application/vnd.ms-outlook
    application/octet-stream
    application/octet-stream

Tested with a .txt file renamed to .msg:

    text/plain
    text/plain
    application/octet-stream

It seems that the simplest way, using the empty Tika constructor, is the most reliable in this case.

Update: you can write your own checker using Apache POI scratchpad. For example, this is a simple implementation that returns the MIME type of the message, or null if the file is not in the proper format (usually signalled by org.apache.poi.poifs.filesystem.NotOLE2FileException: Invalid header signature):

    import org.apache.poi.hsmf.MAPIMessage;

    public class PoiMsgMime {

        public String getMessageMime(String fileName) {
            try {
                new MAPIMessage(fileName);
                return "application/vnd.ms-outlook";
            } catch (Exception e) {
                return null;
            }
        }
    }
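
If even POI is too heavy, the same idea can be roughly approximated with the JDK alone. A .msg file is an OLE2 compound file, the same container used by legacy .doc files, which is likely why magic-number detectors answer application/msword. The sketch below checks the 8-byte OLE2 header signature and then looks for the stream name `__properties_version1.0`, which the Outlook message format is assumed here to always contain (directory entry names are stored as UTF-16LE). The class MsgSniffer and its methods are hypothetical, and this is a heuristic: it does not parse the container, so for instance a Word document with an embedded message could be misreported.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical, dependency-free heuristic; not a full OLE2 parser.
class MsgSniffer {

    // OLE2 compound file signature, shared by .msg, .doc, .xls and others.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Assumption: every Outlook .msg contains a stream with this name.
    private static final byte[] MSG_MARKER =
            "__properties_version1.0".getBytes(StandardCharsets.UTF_16LE);

    /** Returns "application/vnd.ms-outlook" if the bytes look like a .msg, else null. */
    static String getMessageMime(byte[] data) {
        if (!startsWith(data, OLE2_MAGIC)) {
            return null; // not an OLE2 container at all
        }
        return indexOf(data, MSG_MARKER) >= 0 ? "application/vnd.ms-outlook" : null;
    }

    /** Convenience overload; reads the whole file into memory for simplicity. */
    static String getMessageMime(String fileName) throws IOException {
        return getMessageMime(Files.readAllBytes(Paths.get(fileName)));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Naive byte search; adequate for files of a few megabytes.
    private static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(getMessageMime(args[0]));
    }
}
```

Unlike an extension lookup, this inspects the content, so a text file renamed to .msg yields null. The POI version remains the stricter check, because it actually opens and parses the message.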

Paizo answered Oct 05 '22 07:10