
How do I extract an Outlook message nested in another using Apache POI - HSMF?

I am using Apache POI - HSMF to extract attachments from Outlook msg files. It works fine except for nested messages. If an msg is attached to another msg, I am able to get the files. If a message is nested, I get its information, but I need the file itself.

MAPIMessage msg = new MAPIMessage(fileName);
for (AttachmentChunks attachment : msg.getAttachmentFiles()) {
    if (attachment.attachmentDirectory != null) {
        MAPIMessage nestedMsg = attachment.attachmentDirectory.getAsEmbededMessage();
        // now save nestedMsg as a msg-file
    }
}

Is it possible to save the nested message file as a regular msg-file?

Jan asked Jan 14 '23 22:01


1 Answer

Promoting a comment to an answer. I can tell you how to extract an embedded Outlook message to a new file, which Apache POI will then happily open. What I'm less sure of is whether an embedded message contains everything that Outlook expects to find in a standalone message, so I can't promise that the resulting file will open in Outlook without issues.

First up, embedded resources in Outlook. Depending on the kind of thing it is, a resource might be stored in a regular byte chunk, in some other kind of special chunk (e.g. compressed RTF), or in a self-contained sub-directory in the file. Embedded messages are stored in the last of these ways.

If you want to extract out an embedded message, what you'll want to do is create a new OLE2 file container, using POIFSFileSystem (all Outlook messages are stored in OLE2 containers). Then, you'll want to copy the contents of the embedded message's directory in the source OLE2 container into the root of the new one. Finally, write out that POIFSFileSystem to a new file, and your extraction is complete!

You'll likely want to do something like:

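 import java.io.File;
 import java.io.FileOutputStream;
 import org.apache.poi.hsmf.MAPIMessage;
 import org.apache.poi.hsmf.datatypes.AttachmentChunks;
 import org.apache.poi.poifs.filesystem.EntryUtils;
 import org.apache.poi.poifs.filesystem.NPOIFSFileSystem;
 import org.apache.poi.poifs.filesystem.POIFSFileSystem;

 // Written against the POI 3.x API; newer releases may need POIFSFileSystem instead of NPOIFSFileSystem
 MAPIMessage msg = new MAPIMessage(new NPOIFSFileSystem(new File("test.msg")));
 int number = 0;
 for (AttachmentChunks att : msg.getAttachmentFiles()) {
     // Embedded messages live in a sub-directory rather than a byte chunk
     if (att.attachmentDirectory != null) {
         number++;
         // New, empty OLE2 container for the extracted message
         POIFSFileSystem newMsg = new POIFSFileSystem();
         // Copy the embedded message's directory into the root of the new container
         EntryUtils.copyNodes(att.attachmentDirectory.getDirectory(), newMsg.getRoot());
         try (FileOutputStream out = new FileOutputStream("embedded-" + number + ".msg")) {
             newMsg.write(out);
         }
     }
 }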
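
Since all Outlook messages are stored in OLE2 containers, one cheap sanity check on an extracted file is to confirm that it starts with the 8-byte OLE2 signature. The sketch below uses only the JDK; the class name Ole2SignatureCheck and its methods are made up for illustration and are not part of Apache POI. Passing this check only shows the container header is right, not that Outlook will accept the message.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Apache POI.
final class Ole2SignatureCheck {
    // The 8-byte signature found at the start of every OLE2 compound file.
    private static final byte[] OLE2_SIGNATURE = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Returns true if the given bytes begin with the OLE2 signature.
    static boolean hasOle2Signature(byte[] header) {
        return header.length >= OLE2_SIGNATURE.length
            && Arrays.equals(Arrays.copyOf(header, OLE2_SIGNATURE.length), OLE2_SIGNATURE);
    }

    // Reads up to the first 8 bytes of the file and checks them.
    static boolean isOle2File(Path file) throws IOException {
        byte[] header = new byte[OLE2_SIGNATURE.length];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (total < header.length
                    && (n = in.read(header, total, header.length - total)) > 0) {
                total += n;
            }
        }
        return hasOle2Signature(Arrays.copyOf(header, total));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Ole2SignatureCheck <file.msg>");
            return;
        }
        Path file = Paths.get(args[0]);
        System.out.println(file + " starts with the OLE2 signature: " + isOle2File(file));
    }
}
```

Run it against one of the files written above, for example embedded-1.msg.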

If Outlook has a sulk, try opening the source file in Outlook, saving the embedded message to a new file, and then using the likes of org.apache.poi.poifs.dev.POIFSLister and org.apache.poi.poifs.dev.POIFSDump to compare the Outlook-extracted and POI-extracted files, and see if you can spot any changes that Outlook makes.
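
To make that comparison less of an eyeball job, one option is to redirect the POIFSLister output for each file into a text file and diff the two listings. The sketch below is a rough aid under that assumption, using only the JDK: the class ListingDiff is hypothetical, and it makes no assumption about the exact format POIFSLister prints. It compares lines verbatim after trimming indentation, so nesting is ignored and entries with the same name in different directories collapse into one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of Apache POI.
final class ListingDiff {
    // Lines present in `left` but missing from `right`, trimmed and sorted.
    static Set<String> onlyIn(List<String> left, List<String> right) {
        Set<String> result = normalise(left);
        result.removeAll(normalise(right));
        return result;
    }

    // Trims each line and drops blank ones, so indentation differences are ignored.
    private static Set<String> normalise(List<String> lines) {
        Set<String> out = new TreeSet<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ListingDiff <outlook-listing.txt> <poi-listing.txt>");
            return;
        }
        List<String> outlook = Files.readAllLines(Paths.get(args[0]));
        List<String> poi = Files.readAllLines(Paths.get(args[1]));
        System.out.println("Only in the Outlook-extracted file: " + onlyIn(outlook, poi));
        System.out.println("Only in the POI-extracted file: " + onlyIn(poi, outlook));
    }
}
```

Entries that show up only on the Outlook side are the first candidates for things Outlook adds when it saves an embedded message on its own.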

Gagravarr answered Jan 27 '23 12:01