
Ruby - Parsing emails from text or html [closed]

Tags: ruby

From what I understand, a regular expression is not the best tool for scanning a document for emails. Are there any alternatives, or a best practice that I'm unaware of?

Mr. Demetrius Michael asked Jan 12 '13 01:01


1 Answer

For parsing jobs it is generally a good idea to rely on libraries. You are right: a well-maintained library will usually have dealt with the problem in more detail than a single regular expression, covering edge cases that are easy to miss.
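For the case in the question (finding addresses inside text or HTML), here is a minimal sketch of leaning on a library instead of a hand-written pattern. Ruby's standard library ships an address pattern, `URI::MailTo::EMAIL_REGEXP`. It is still a regular expression under the hood, but it is maintained with the language. Because it is anchored to a whole string, the sketch first cuts the text into rough tokens and then validates each one. The helper name `extract_email_candidates` and the delimiter set are assumptions for illustration, and the check only confirms that a token is shaped like an address, not that the mailbox exists.

```ruby
require 'uri'

# Hypothetical helper: returns the unique address-shaped tokens found in +text+.
# The delimiter set is an assumption; note that it also splits on quote
# characters, which are technically allowed in the local part of an address.
def extract_email_candidates(text)
  text
    .split(/[\s<>(),;:"']+/)                     # cut the text into rough tokens
    .map { |token| token.sub(/[.!?]+\z/, '') }   # drop trailing sentence punctuation
    .select { |token| token.match?(URI::MailTo::EMAIL_REGEXP) }
    .uniq
end

sample = 'Write to joe@example.com, or see <a href="mailto:ann.lee@mail.example.org">Ann</a>.'
puts extract_email_candidates(sample).inspect
```

Tokenizing first keeps the validation step simple, at the cost of missing addresses glued to other characters (for example `email=joe@example.com`).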

One Ruby library for parsing emails is Mail. Its README describes it as follows:

Mail is an internet library for Ruby that is designed to handle emails generation, parsing and sending in a simple, rubyesque manner.

[...] Mail has been designed with a very simple object oriented system that really opens up the email messages you are parsing, if you know what you are doing, you can fiddle with every last bit of your email directly.

Here is an example of how the email's content is accessed:

require 'mail'

# The addresses and message ID shown in the results are illustrative placeholders.
mail = Mail.read('/path/to/message.eml')

mail.envelope_from   #=> 'sender@example.com'
mail.from.addresses  #=> ['sender@example.com', 'second.sender@example.com']
mail.sender.address  #=> 'sender@example.com'
mail.to              #=> 'recipient@example.com'
mail.cc              #=> 'copied@example.com'
mail.subject         #=> "This is the subject"
mail.date.to_s       #=> '21 Nov 1997 09:55:06 -0600'
mail.message_id      #=> '<unique-id@example.com>'
mail.body.decoded    #=> 'This is the body of the email...

It also enables you to parse a multipart email, as well as test and extract the attachments.

Konrad Reiche answered Oct 03 '22 11:10