Is there a class or set of functions built into the .NET Framework (3.5+) to parse raw emails (MIME documents)?
I am not looking for anything fancy or a separate library; it needs to be built-in. I'm going to be using this in some unit tests and only need to grab the main headers of interest (To, From, Subject) along with the body (which in this case will always be text, so there are no MIME trees or boundaries). I've written several MIME parsers in the past, and if there isn't anything readily available, I'll just put together something from regular expressions. It would be great to be able to do something like:
MailMessage msg = MailMessage.Parse(text);
Thoughts?
I know you said no external libraries, but I have a library posted on Bitbucket:
https://bitbucket.org/otac0n/mailutilities
MimeMessage msg = new MimeMessage(/* string, stream, or Byte[] */);
It has been tested with over 40,000 real-world mail messages.
I'm not too happy with my namespace choice, but... I'm too lazy to change it.
Internally, my library uses these regexes as a parser:
internal static string FullMessageMatch = @"\A(?<header>(?:[^\r\n]+\r\n)*)(?<header_term>\r\n)(?<body>.*)\z";
internal static string HeadersMatch = @"^(?<header_key>[-A-Za-z0-9]+)(?<seperator>:[ \t]*)(?<header_value>([^\r\n]|\r\n[ \t]+)*)(?<terminator>\r\n)";
internal static string HeaderSeperator = "\r\n";
internal static string KeyValueSeparator = @"\A:[ \t]*\z";
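If you only need To, From, Subject and a plain-text body for unit tests, the first two patterns above are probably enough on their own. Here is a rough sketch of how they could be applied. Note that `SimpleMessage` and its `Parse` method are hypothetical names made up for this example; they are not part of the library or the .NET Framework. The `RegexOptions` used (`Singleline` for the full message, `Multiline` for the headers) are my assumption about what the patterns need, and the message is assumed to use CRLF line endings.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not the library's API.
public sealed class SimpleMessage
{
    public Dictionary<string, string> Headers { get; private set; }
    public string Body { get; private set; }

    // Same patterns as quoted above.
    private const string FullMessageMatch =
        @"\A(?<header>(?:[^\r\n]+\r\n)*)(?<header_term>\r\n)(?<body>.*)\z";
    private const string HeadersMatch =
        @"^(?<header_key>[-A-Za-z0-9]+)(?<seperator>:[ \t]*)(?<header_value>([^\r\n]|\r\n[ \t]+)*)(?<terminator>\r\n)";

    public static SimpleMessage Parse(string text)
    {
        // Assumption: Singleline, so that ".*" in the body group spans line breaks.
        Match full = Regex.Match(text, FullMessageMatch, RegexOptions.Singleline);
        if (!full.Success)
            throw new FormatException("No blank line separating headers from body.");

        // Header names are case-insensitive, so use a case-insensitive dictionary.
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        // Assumption: Multiline, so that "^" matches at the start of each header line.
        foreach (Match m in Regex.Matches(full.Groups["header"].Value, HeadersMatch, RegexOptions.Multiline))
        {
            // Unfold continuation lines (CRLF followed by whitespace) into a single space.
            string value = Regex.Replace(m.Groups["header_value"].Value, @"\r\n[ \t]+", " ");
            // A repeated header overwrites the earlier value; fine for To/From/Subject.
            headers[m.Groups["header_key"].Value] = value;
        }

        return new SimpleMessage { Headers = headers, Body = full.Groups["body"].Value };
    }
}

public static class Demo
{
    public static void Main()
    {
        string raw = "From: alice@example.com\r\nTo: bob@example.com\r\nSubject: Hi\r\n\r\nHello\r\nWorld";
        SimpleMessage msg = SimpleMessage.Parse(raw);
        Console.WriteLine(msg.Headers["subject"]); // prints "Hi"
        Console.WriteLine(msg.Body);
    }
}
```

This does no decoding of encoded words or transfer encodings, so it only suits the simple text-only case described in the question.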