
Determine Content Type of Files in Zip Archive in Java

The environment is Google App Engine, and the zip file was uploaded to the Blobstore.

I have the following code:

ZipInputStream zis = ...
ZipEntry ze = zis.getNextEntry();
while( ze != null){
    System.out.println(ze.getName());
    ze = zis.getNextEntry();
}

How can I determine the content type of each file in the zip archive? The ze.getName() method returns the name of the file, but how do I get the file type?

Thanks

JR Galia asked Dec 27 '22 14:12


1 Answer

You can determine the MIME type instead of guessing from the file extension alone, which may be missing in some cases. Here are the options for establishing the MIME type of a file:

  1. Using javax.activation.MimetypesFileTypeMap, like:

    System.out.println("Mime Type of " + f.getName() + " is " +
        new MimetypesFileTypeMap().getContentType(f));
    
  2. Using java.net.URL

    URL u = new URL(fileUrl);
    URLConnection uc = u.openConnection();
    String type = uc.getContentType();
    
  3. Using Apache Tika

    ContentHandler contenthandler = new BodyContentHandler();
    Metadata metadata = new Metadata();
    metadata.set(Metadata.RESOURCE_NAME_KEY, f.getName());
    Parser parser = new AutoDetectParser();
    // OOXMLParser parser = new OOXMLParser();
    parser.parse(is, contenthandler, metadata);
    System.out.println("Mime: " + metadata.get(Metadata.CONTENT_TYPE));
    System.out.println("Title: " + metadata.get(Metadata.TITLE));
    System.out.println("Author: " + metadata.get(Metadata.AUTHOR));
    System.out.println("content: " + contenthandler.toString());
    
  4. Using JMimeMagic

    MagicMatch match = parser.getMagicMatch(f);
    System.out.println(match.getMimeType()) ;
    
  5. Using mime-util

    Collection<?> mimeTypes = MimeUtil.getMimeTypes(f);
    
  6. Using DROID

    Droid (Digital Record Object Identification) is a software tool to 
    perform automated batch identification of file formats.
    
  7. Aperture framework

    Aperture is an open source library and framework for crawling and indexing
    information sources such as file systems, websites and mail boxes.
    

See "Get the Mime Type from a File" for more details on each of the above options.

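In this case the easiest way is to use the first option, javax.activation.MimetypesFileTypeMap. Note that it maps the extension in the file name to a MIME type and returns application/octet-stream when the extension is missing or unknown, so you can pass it the entry name directly: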

MimetypesFileTypeMap mtft = new MimetypesFileTypeMap();
String mimeType = mtft.getContentType(ze.getName());
System.out.println(ze.getName()+" type: "+ mimeType);
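
If some entries have no extension, a name lookup can be combined with a look at the first bytes of each entry. The following is only a minimal sketch of that idea, not the approach from the options above: it swaps in the JDK's static helpers URLConnection.guessContentTypeFromName and URLConnection.guessContentTypeFromStream, which recognise only a handful of formats. The class name ZipContentTypes and the methods detect and buildSampleZip are hypothetical names chosen for illustration, and the in-memory archive stands in for the stream you would open from the Blobstore.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, for illustration only.
class ZipContentTypes {

    /** Returns entry name -> guessed content type, in archive order. */
    static Map<String, String> detect(InputStream in) throws IOException {
        Map<String, String> types = new LinkedHashMap<>();
        try (ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry ze;
            while ((ze = zis.getNextEntry()) != null) {
                if (ze.isDirectory()) {
                    continue; // directories have no content to classify
                }
                // Step 1: cheap lookup based on the extension in the entry name.
                String type = URLConnection.guessContentTypeFromName(ze.getName());
                if (type == null) {
                    // Step 2: sniff the leading bytes of the entry. The helper needs
                    // mark/reset support, hence the buffered wrapper. The wrapper is
                    // deliberately not closed, since that would close zis as well.
                    InputStream entryData = new BufferedInputStream(zis);
                    type = URLConnection.guessContentTypeFromStream(entryData);
                }
                // Fall back to the generic binary type when both guesses fail.
                types.put(ze.getName(), type != null ? type : "application/octet-stream");
            }
        }
        return types;
    }

    /** Builds a small in-memory archive so the sketch can run without a real file. */
    static byte[] buildSampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry("notes.txt"));
            zos.write("hello".getBytes(StandardCharsets.US_ASCII));
            zos.closeEntry();

            // No extension: only the PNG signature bytes identify this entry.
            zos.putNextEntry(new ZipEntry("logo"));
            zos.write(new byte[] {(byte) 137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13});
            zos.closeEntry();

            // No extension and no recognisable signature.
            zos.putNextEntry(new ZipEntry("mystery"));
            zos.write(new byte[] {1, 2, 3, 4});
            zos.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> types = detect(new ByteArrayInputStream(buildSampleZip()));
        for (Map.Entry<String, String> e : types.entrySet()) {
            System.out.println(e.getKey() + " type: " + e.getValue());
        }
    }
}
```

In your loop you would pass the Blobstore-backed stream to detect instead of the sample archive. For broader format coverage, a detector such as Apache Tika (option 3) is likely the better fit.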

dan answered Jan 18 '23 22:01