
In PHP, how do I extract multiple e-mail addresses from a block of text and put them into an array?

Tags: regex, php, email

I have a block of text from which I want to extract the valid e-mail addresses and put them into an array. So far I have...

    $string = file_get_contents("example.txt"); // Load text file contents
    $matches = array(); // create array
    $pattern = '/[A-Za-z0-9_-]+@[A-Za-z0-9_-]+\.([A-Za-z0-9_-][A-Za-z0-9_]+)/'; // regex for pattern of e-mail address
    preg_match($pattern, $string, $matches); // find matching pattern

However, I am getting an array with only one address, so I am guessing I need to loop through this process somehow. How do I do that?

HumbleHelper asked Oct 10 '10 16:10


2 Answers

You're pretty close. The main fix is to call preg_match_all instead of preg_match: preg_match stops after the first match, which is why you only get one address. Also, your regex wouldn't catch all email formats, and you don't need to spell out A-Za-z; the "i" flag makes the entire expression case insensitive. The pattern below still misses some formats (especially subdomains), but it catches the ones I tested.

    $string = file_get_contents("example.txt"); // Load text file contents

    // don't need to preassign $matches, it's created dynamically

    // this regex handles more email address formats like [email protected], and the i makes it case insensitive
    $pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';

    // preg_match_all finds every match (preg_match stops after the first);
    // it returns the number of matches and fills $matches with a nested array
    preg_match_all($pattern, $string, $matches);

    // the full matches are in $matches[0], dump it with var_export() to see it
    var_export($matches[0]);

output:

    array (
      0 => '[email protected]',
      1 => '[email protected]',
      2 => '[email protected]',
      3 => '[email protected]',
      4 => '[email protected]',
    )
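To see the difference between the two functions without needing a file, here is a minimal sketch on an inline string. The sample text and its addresses are made up for illustration; the pattern is the one from the code above.

```php
<?php
// Hypothetical sample text; the addresses are invented for illustration.
$string = "Contact alice@example.com or bob@example.org, not carol-at-example.net.";

// Same pattern as in the answer above.
$pattern = '/[a-z0-9_\-\+]+@[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';

// preg_match() stops at the first match.
preg_match($pattern, $string, $first);

// preg_match_all() keeps scanning and returns the number of full matches.
$count = preg_match_all($pattern, $string, $all);

var_export($first[0]); // the first address only
echo "\n";
var_export($all[0]);   // every full match
```

With this input, $first[0] holds only the first address, while $all[0] holds both. Index 1 of each result holds the text captured by the parenthesised group (the top-level domain), which is why the addresses themselves are read from index 0.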
Clay Hinson answered Oct 07 '22 19:10


I know this is not the question you asked, but I noticed that your regex does not accept addresses like '[email protected]' or any address with a subdomain. You could replace it with something like:

/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}/ 

which will reject fewer valid e-mail addresses (although it is not perfect).

I also suggest you read this article on e-mail validation; it is pretty good and informative.
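As a rough sketch of how this looser pattern could be used with preg_match_all, the example below runs it on an inline string. The sample text and addresses are invented, and the second pass through PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL is an optional extra I am assuming you might want, not something the pattern requires.

```php
<?php
// Hypothetical sample text; the addresses are invented for illustration.
$text = "Write to john.doe@mail.example.com, jane+news@example.co.uk or root@localhost.";

// The looser pattern suggested above.
$pattern = '/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}/';

// Collect every full match into $matches[0].
preg_match_all($pattern, $text, $matches);

// Optional second pass (an assumption, not part of the suggestion above):
// keep only candidates that PHP's own validator also accepts.
$emails = array_values(array_filter($matches[0], function ($candidate) {
    return filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false;
}));

var_export($emails);
```

Here the two addresses with subdomains or multi-part domains are picked up, while root@localhost is skipped because it has no dot-separated ending. One limitation to keep in mind: because the last part is limited to {2,4} letters, a longer top-level domain would be matched only up to its fourth letter.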

Eric-Karl answered Oct 07 '22 18:10