Documentation on Apple Mail's .emlx data structure(s) (for conversion purposes)?

This appears to be a rare gem: where to find documentation on the structure of Apple Mail's .emlx files (and their partial variants, and the meaning of the directory structures). The docs do not appear to exist on Apple's site, nor can I find any reasonable mention of it via Google.

The point of this is to write a bash/ruby/python/insert-script-language-here script that converts a mess of these files into something usable and pliable, like Maildir or Mbox. The ultimate goal is to migrate a snapshot of a user's /Library/Mail store into an existing Dovecot setup, which uses a form of Maildir.

Yes, I am aware of this program, but it does not provide the solution I am after. Converting 20 mailboxes by hand and manually inserting them into an existing installation will take more hours than writing a script that digests the messages into something else and then automatically stores them where they should be. Never mind that there are potentially a half-dozen more users who will need this procedure. So it's worth my time to script it up.

Please vote to close the duplicate of this question while it is pending deletion, instead of voting for this question to close. For some reason, there are occasional posting glitches when using Chrome as a browser.

FOLLOW-UP: It appears that the format really is undocumented, and that most sources have reverse-engineered it. If I have time I will attempt to do so myself; if I'm successful, I will post a second follow-up with the details of my findings.

Avery Payne asked May 19 '09 18:05


2 Answers

Here is an emlx2mbox converter in Ruby: Mailbox Converter.

I don't think it was written from any documentation of the spec, but it has undergone multiple updates, so it has hopefully evolved to handle at least some of the quirks of the format. The source code is about 250 lines long, and it looks readable and well-commented.

Matt G answered Oct 30 '22 16:10


Some more information documenting the emlx format.

The message is composed of:

  • a byte count for the message on the first line
  • a MIME dump of the message
  • an XML plist
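
The three parts listed above can be separated with a short Python sketch. It assumes the byte count on the first line covers exactly the MIME message and that everything after it is the XML plist; the function name `parse_emlx` is made up for this illustration, and partial variants are not handled.

```python
import plistlib
from email import message_from_bytes


def parse_emlx(raw):
    """Split raw .emlx bytes into (email message, plist metadata dict).

    Assumption: the first line holds the byte length of the MIME message
    (possibly padded with spaces) and the plist follows immediately after.
    """
    count_line, newline, rest = raw.partition(b"\n")
    if not newline:
        raise ValueError("missing byte-count line")
    length = int(count_line.strip())

    # The next `length` bytes are the MIME dump of the message.
    message = message_from_bytes(rest[:length])

    # Whatever remains is the XML property list.
    metadata = plistlib.loads(rest[length:].strip())
    return message, metadata
```

Calling `parse_emlx(open(path, "rb").read())` returns the parsed message and a dictionary with keys such as `flags` and `subject`.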

The XML plist contains keys such as:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>date-sent</key>
        <real>1362211252</real>
        <key>flags</key>
        <integer>8590195713</integer>
        <key>original-mailbox</key>
        <string>imap://****@127.0.0.1:143/mail/2013/03</string>
        <key>remote-id</key>
        <string>252</string>
        <key>subject</key>
        <string>Re: Foobar</string>
</dict>
</plist>

The flags have been described by jwz; the integer is a bit field laid out as follows:

0      read                      1 << 0
1      deleted                   1 << 1
2      answered                  1 << 2
3      encrypted                 1 << 3
4      flagged                   1 << 4
5      recent                    1 << 5
6      draft                     1 << 6
7      initial (no longer used)  1 << 7
8      forwarded                 1 << 8
9      redirected                1 << 9
10-15  attachment count          3F << 10 (6 bits)
16-22  priority level            7F << 16 (7 bits)
23     signed                    1 << 23
24     is junk                   1 << 24
25     is not junk               1 << 25
26-28  font size delta           7 << 26 (3 bits)
29     junk mail level recorded  1 << 29
30     highlight text in toc     1 << 30
31     (unused)
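
A minimal sketch of decoding that bit field in Python, taking the table above at face value. The names `SINGLE_BIT_FLAGS` and `decode_flags` are invented for this example. Note that the sample value 8590195713 (0x20003FC01) has a bit set above bit 31, which the table does not describe, so the sketch reports anything beyond the low 32 bits separately instead of guessing at its meaning.

```python
# Single-bit flags from the table, keyed by bit position.
SINGLE_BIT_FLAGS = {
    0: "read", 1: "deleted", 2: "answered", 3: "encrypted", 4: "flagged",
    5: "recent", 6: "draft", 7: "initial", 8: "forwarded", 9: "redirected",
    23: "signed", 24: "is_junk", 25: "is_not_junk",
    29: "junk_mail_level_recorded", 30: "highlight_text_in_toc",
}


def decode_flags(flags):
    """Decode an emlx `flags` integer into a dict, following the table."""
    decoded = {
        name: bool((flags >> bit) & 1)
        for bit, name in SINGLE_BIT_FLAGS.items()
    }
    # Multi-bit fields: shift down, then mask to the field width.
    decoded["attachment_count"] = (flags >> 10) & 0x3F  # bits 10-15
    decoded["priority_level"] = (flags >> 16) & 0x7F    # bits 16-22
    decoded["font_size_delta"] = (flags >> 26) & 0x7    # bits 26-28
    # Bits above 31 are not covered by the table; keep them visible.
    decoded["unknown_high_bits"] = flags >> 32
    return decoded
```

For the sample value, `decode_flags(8590195713)["read"]` is `True` and `unknown_high_bits` is non-zero.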

Here is a simple message I sent to myself, with some details removed, so you can see the full data structure of an emlx file:

875       
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on ******.*********.***
X-Spam-Level: 
X-Spam-Status: No, score=-3.2 required=4.2 tests=BAYES_00,RP_MATCHES_RCVD,
        SPF_PASS,TVD_SPACE_RATIO autolearn=ham version=3.3.2
Received: from [127.0.0.1] (******.*********.*** [***.**.**.**])
        by ******.*********.*** (8.14.5/8.14.5) with ESMTP id r2TN8m4U099571
        for <****@*********.***>; Fri, 29 Mar 2013 19:08:48 -0400 (EDT)
        (envelope-from ****@*********.***)
Subject: very simple
From: Karl Dubost <****@*********.***>
Content-Type: text/plain; charset=us-ascii
Message-Id: <4E83618E-BB56-404F-8595-87352648ADC7@*********.***>
Date: Fri, 29 Mar 2013 19:09:06 -0400
To: Karl Dubost <****@*********.***>
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v1283)
X-Mailer: Apple Mail (2.1283)

message Foo
-- 
Karl Dubost
http://www.la-grange.net/karl/
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>date-sent</key>
        <real>1364598546</real>
        <key>flags</key>
        <integer>8590195713</integer>
        <key>original-mailbox</key>
        <string>imap://********@127.0.0.1:11143/mail/2013/03</string>
        <key>remote-id</key>
        <string>41147</string>
        <key>subject</key>
        <string>very simple</string>
</dict>
</plist>
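
Given that structure, moving a single file into a Maildir (the goal stated in the question) might look like the following sketch. It uses Python's standard `mailbox` module; the function name `emlx_to_maildir` and the mapping from emlx flag bits to Maildir info flags are assumptions made for this example, not something Apple or Dovecot documents.

```python
import mailbox
import plistlib
from pathlib import Path

# Assumed mapping from emlx flag bits (see the table) to Maildir info flags.
EMLX_BIT_TO_MAILDIR_FLAG = {
    0: "S",  # read      -> Seen
    1: "T",  # deleted   -> Trashed
    2: "R",  # answered  -> Replied
    4: "F",  # flagged   -> Flagged
    6: "D",  # draft     -> Draft
    8: "P",  # forwarded -> Passed
}


def emlx_to_maildir(emlx_path, maildir_path):
    """Append one .emlx message to a Maildir and return its Maildir key."""
    raw = Path(emlx_path).read_bytes()

    # Line 1 is the byte count; that many bytes of MIME message follow.
    count_line, _, rest = raw.partition(b"\n")
    length = int(count_line.strip())
    message = mailbox.MaildirMessage(rest[:length])

    # The trailing plist carries the flags; treat a missing plist as 0.
    trailer = rest[length:].strip()
    flags = plistlib.loads(trailer).get("flags", 0) if trailer else 0

    info = "".join(
        letter
        for bit, letter in EMLX_BIT_TO_MAILDIR_FLAG.items()
        if (flags >> bit) & 1
    )
    message.set_flags(info)
    message.set_subdir("cur")  # already-delivered mail belongs in cur/

    box = mailbox.Maildir(str(maildir_path), create=True)
    try:
        return box.add(message)
    finally:
        box.close()
```

Looping this over every `.emlx` file in a mailbox directory would give a bulk conversion; how the Apple Mail directory names map onto Dovecot folders is left open here.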
karlcow answered Oct 30 '22 15:10