
Handling blank lines in email headers

I came across a few emails that aren't RFC compliant:

authentication-results: spf=none (sender IP is ) smtp.mailfrom=**@********.**;

Content-Type: multipart/related;
    boundary="_004_2039b206f2a54788ba6a101978bd3f82DBXPR07MB013eurprd07pro_";
    type="multipart/alternative"
MIME-Version: 1.0

For example, the mail above has a blank line in the headers (before Content-Type). Libraries that strictly abide by the RFC (for example https://github.com/mikel/mail) won't be able to parse it. Apple Mail and Thunderbird do manage to handle such mails.

I tried browsing through Thunderbird's codebase, but being unfamiliar with C++, I only managed to find https://github.com/mozilla/releases-comm-central/blob/1f2a40ec2adb448043de0ae96d93b44a9bfefcd1/mailnews/mime/src/mimemsg.cpp

Can someone point me to the part of Thunderbird's codebase where mail parsing happens, or to any open-source libraries or apps that handle such non-compliant mails?

EDIT:

Hexdump of the blank line. It contains a space.

00013e0: 2a2a 2a2a 2a2a 2e2a 2a3b 0d0a 200d 0a43  ******.**;.. ..C
00013f0: 6f6e 7465 6e74 2d54 7970 653a 206d 756c  ontent-Type: mul
0001400: 7469 7061 7274 2f72 656c 6174 6564 3b0d  tipart/related;.

asked Oct 01 '22 11:10 by nisanth074



1 Answer

The Ruby code in the referenced library does not conform to the RFC, which allows a single header field to be folded across multiple lines. The rule is that a continuation line of a folded header must start with white space (a space or a tab); the exact details are in RFC 5322, section "Folding White Space and Comments".

The most likely problem is that the Ruby code reads each line and trims the white space before parsing, and so fails to detect that the extra line in fact belongs to the previous header. The extra line does not add anything to the header (it contains just a space), but it is valid syntax.
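
The folding rule above can be sketched in a few lines of Ruby. This is only an illustration: `unfold_headers` is a hypothetical helper, not part of the mail gem, and it simplifies strict unfolding by collapsing the leading white space of each continuation line into a single space.

```ruby
# Hypothetical helper (not part of the mail gem): turn a raw header block
# into logical header lines. Any line starting with a space or tab is
# treated as a continuation of the previous header.
def unfold_headers(raw_headers)
  raw_headers.split("\r\n").each_with_object([]) do |line, headers|
    if line.match?(/\A[ \t]/) && !headers.empty?
      # Continuation line: append its content to the previous header.
      # A whitespace-only line contributes nothing, so rstrip removes
      # the joining space again.
      headers[-1] = "#{headers[-1]} #{line.strip}".rstrip
    else
      headers << line
    end
  end
end

# Made-up sample shaped like the mail in the question.
raw = "Authentication-Results: spf=none;\r\n" \
      " \r\n" \
      "Content-Type: multipart/related;\r\n" \
      "    type=\"multipart/alternative\"\r\n" \
      "MIME-Version: 1.0"

unfold_headers(raw).each { |header| puts header }
```

With this approach the line containing only a space is absorbed into the preceding header instead of being mistaken for the end of the header block.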

EDIT:

The non-compliant behaviour was introduced in commit 17783f8536fc09b926c7425dbacfc35e0e851ef5. One of its side effects is that the headers and body are split on an empty folded header line:

CRLF = /\r\n/
white_space = %Q|\x9\x20|
WSP = /[#{white_space}]/

header_part, body_part = raw_source.split(/#{CRLF}#{WSP}*#{CRLF}(?!#{WSP})/m, 2)
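
To see the effect, the same pattern can be run against a small message shaped like the one in the question. The message below is made up for illustration, and lowercase local variables stand in for the constants so that the script runs on its own.

```ruby
# Same pattern as the snippet above, with local variables instead of constants.
crlf = /\r\n/
white_space = %Q|\x9\x20|
wsp = /[#{white_space}]/

# Made-up message: the second line holds a single space, as in the hexdump.
raw_source = "Authentication-Results: spf=none;\r\n" \
             " \r\n" \
             "Content-Type: text/plain\r\n" \
             "MIME-Version: 1.0\r\n" \
             "\r\n" \
             "Hello"

header_part, body_part = raw_source.split(/#{crlf}#{wsp}*#{crlf}(?!#{wsp})/m, 2)

p header_part # => "Authentication-Results: spf=none;"
p body_part   # => "Content-Type: text/plain\r\nMIME-Version: 1.0\r\n\r\nHello"
```

The split happens at the whitespace-only line, so Content-Type and MIME-Version end up in the body instead of the headers.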

The issue was raised in commit a2a45597bce66ebe788cedaaab848a37bd04b25a, but the consensus was to not break existing behavior.
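
Since the library keeps this behaviour, one possible workaround is to clean the raw source before handing it to the parser. The sketch below is an assumption rather than anything the library provides: `strip_whitespace_only_header_lines` is a hypothetical helper, and it assumes CRLF line endings and that the first truly empty line separates the headers from the body.

```ruby
# Hypothetical pre-processing step (not part of the mail gem): drop header
# lines that contain only spaces or tabs, leaving the body untouched.
def strip_whitespace_only_header_lines(raw_source)
  # Everything before the first truly empty line is treated as the header block.
  header, separator, body = raw_source.partition("\r\n\r\n")
  # Remove each CRLF + whitespace-only line that is followed by another CRLF
  # or by the end of the header block.
  cleaned = header.gsub(/\r\n[ \t]+(?=\r\n|\z)/, "")
  "#{cleaned}#{separator}#{body}"
end

# Made-up sample shaped like the mail in the question.
raw_source = "Authentication-Results: spf=none;\r\n" \
             " \r\n" \
             "Content-Type: text/plain\r\n" \
             "MIME-Version: 1.0\r\n" \
             "\r\n" \
             "Hello"

puts strip_whitespace_only_header_lines(raw_source)
```

Removing such a line does not change any header value, because a folded line holding only white space adds nothing to the header it continues.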

answered Oct 03 '22 00:10 by Soren
