
Are there .NET Framework methods to parse an email (MIME)?

Is there a class or set of functions built into the .NET Framework (3.5+) to parse raw emails (MIME documents)?

I am not looking for anything fancy or for a separate library; it needs to be built in. I'm going to use this in some unit tests and only need to grab the main headers of interest (To, From, Subject) along with the body, which in this case will always be plain text, so there are no MIME trees or boundaries to deal with. I've written several MIME parsers in the past, and if there isn't anything readily available I'll just put something together from regular expressions. It would be great to be able to do something like:

MailMessage msg = MailMessage.Parse(text); 

Thoughts?

Neil C. Obremski asked Nov 03 '09 19:11



1 Answer

I know you said no external libraries, but I have a library posted on Bitbucket:

https://bitbucket.org/otac0n/mailutilities

MimeMessage msg = new MimeMessage(/* string, stream, or Byte[] */); 

It has been tested with over 40,000 real-world mail messages.

I'm not too happy with my namespace choice, but... I'm too lazy to change it.


PS:

Internally, my library uses these regexes as a parser:

internal static string FullMessageMatch =
    @"\A(?<header>(?:[^\r\n]+\r\n)*)(?<header_term>\r\n)(?<body>.*)\z";
internal static string HeadersMatch =
    @"^(?<header_key>[-A-Za-z0-9]+)(?<seperator>:[ \t]*)(?<header_value>([^\r\n]|\r\n[ \t]+)*)(?<terminator>\r\n)";
internal static string HeaderSeperator =
    "\r\n";
internal static string KeyValueSeparator =
    @"\A:[ \t]*\z";
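
For the simple unit-test case in the question (a few headers plus a plain-text body), the first two patterns above could be applied roughly as follows. This is only a sketch, not the library's actual code: the class name MimeSketch, the method ParseHeaders, and the choice of RegexOptions are assumptions made for illustration, and the input is assumed to use CRLF line endings, as the patterns require.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the library mentioned above.
static class MimeSketch
{
    // The two patterns quoted in the answer, copied unchanged.
    const string FullMessageMatch =
        @"\A(?<header>(?:[^\r\n]+\r\n)*)(?<header_term>\r\n)(?<body>.*)\z";
    const string HeadersMatch =
        @"^(?<header_key>[-A-Za-z0-9]+)(?<seperator>:[ \t]*)(?<header_value>([^\r\n]|\r\n[ \t]+)*)(?<terminator>\r\n)";

    // Splits a raw message into a header dictionary and a body string.
    public static Dictionary<string, string> ParseHeaders(string raw, out string body)
    {
        // Assumption: Singleline, so that '.' in the body group spans line breaks.
        Match full = Regex.Match(raw, FullMessageMatch, RegexOptions.Singleline);
        if (!full.Success)
            throw new FormatException("No blank line separating headers from body.");

        body = full.Groups["body"].Value;

        // Header names are case-insensitive; a repeated header keeps its last value.
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        // Assumption: Multiline, so that '^' matches at the start of every header line.
        foreach (Match m in Regex.Matches(full.Groups["header"].Value, HeadersMatch, RegexOptions.Multiline))
        {
            // Unfold continuation lines (CRLF followed by whitespace) into one space.
            string value = Regex.Replace(m.Groups["header_value"].Value, @"\r\n[ \t]+", " ");
            headers[m.Groups["header_key"].Value] = value;
        }
        return headers;
    }

    static void Main()
    {
        string raw = "From: a@example.com\r\nTo: b@example.com\r\n" +
                     "Subject: Hello\r\n world\r\n\r\nBody line 1\r\nBody line 2";
        string body;
        Dictionary<string, string> headers = ParseHeaders(raw, out body);
        Console.WriteLine(headers["subject"]);
        Console.WriteLine(body);
    }
}
```

A dictionary keeps the sketch short but drops repeated headers such as Received; a list of key/value pairs would preserve them if a test needs that.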
John Gietzen answered Oct 05 '22 04:10