Using Regex in Powershell to grab email

Tags: regex, email, powershell

I have written a script to grab different fields in an HTML file and populate variables with the results. I'm having issues with the regular expression for grabbing the email. Here is some sample code:

$txt='<p class=FillText><a name="InternetMail_P3"></a>First.Last@company-name.com</p>'

$re='.*?'+'([\\w-+]+(?:\\.[\\w-+]+)*@(?:[\\w-]+\\.)+[a-zA-Z]{2,7})'

if ($txt -match $re)
{
    $email1=$matches[1]
    write-host "$email1"
}

I get the following error:

Bad argument to operator '-match': parsing ".*?([\\w-+]+(?:\\.[\\w-+]+)*@(?:[\\w-]+\\
.)+[a-zA-Z]{2,7})([\\w-+]+(?:\\.[\\w-+]+)*@(?:[\\w-]+\\.)+[a-zA-Z]{2,7})" - [x-y] range in reverse order..
At line:7 char:16
+ if ($txt -match <<<<  $re)
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : BadOperatorArgument

What am I missing here? Also, is there a better regex for email?

Thanks in advance.

asked Jul 19 '12 15:07

gp80586

2 Answers

Any regex that works in .NET or C# will also work in PowerShell, and you can find plenty of samples on Stack Overflow and elsewhere on the web. For example, from "How to Find or Validate an Email Address", the section "The Official Standard: RFC 2822":

$txt='<p class=FillText><a name="InternetMail_P3"></a>First.Last@company-name.com</p>'
$re="[a-z0-9!#\$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#\$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
[regex]::Match($txt, $re, "IgnoreCase")
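
A likely cause of the "[x-y] range in reverse order" error in the question: PowerShell single-quoted strings do not treat the backslash as an escape character, so \\w reaches the .NET regex engine as an escaped backslash followed by a literal w. Inside the character class, w-+ is then read as a range from "w" down to "+", which is reversed. A minimal sketch of the question's pattern with single backslashes might look like the following. Moving the hyphen to the end of each class is a stylistic choice here, not a requirement, and the sample address is the one from the question:

```powershell
$txt = '<p class=FillText><a name="InternetMail_P3"></a>First.Last@company-name.com</p>'

# Single backslashes: a single-quoted string passes them to the regex
# engine unchanged, so \w is the word-character class.
$re = '.*?([\w+-]+(?:\.[\w+-]+)*@(?:[\w-]+\.)+[a-zA-Z]{2,7})'

if ($txt -match $re)
{
    # Group 1 holds the captured address.
    $email1 = $matches[1]
    Write-Host "$email1"
}
```

The leading .*? is kept only to stay close to the original; since -match is unanchored, the capture group alone should find the same address.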

There is another part to this answer, though. Regex is by nature not well suited to parsing XML/HTML. You can find more details here: Using regular expressions to parse HTML: why not?

For a real solution, I recommend the following steps:

1. Convert the HTML to XHTML.
2. Walk the XML tree.
3. Work with the individual nodes one by one, using regex on them if needed.
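
The last two steps could be sketched roughly as follows. This assumes the HTML has already been converted to well-formed XHTML by some external tool (the conversion itself is not shown), and it reuses a simple illustrative pattern rather than the full RFC 2822 one. The variable names $xhtml, $doc and $emails are made up for this example:

```powershell
# Assumption: this string is the output of an HTML-to-XHTML conversion.
# Note the quoted class attribute, which well-formed XML requires.
$xhtml = '<p class="FillText"><a name="InternetMail_P3"></a>First.Last@company-name.com</p>'

# Parse the XHTML into a System.Xml.XmlDocument.
$doc = [xml]$xhtml

# Illustrative email pattern (the question's, with single backslashes).
$re = '[\w+-]+(?:\.[\w+-]+)*@(?:[\w-]+\.)+[a-zA-Z]{2,7}'

# Walk every text node and apply the regex to each node on its own.
$emails = @(
    foreach ($node in $doc.SelectNodes('//text()')) {
        $m = [regex]::Match($node.Value, $re)
        if ($m.Success) { $m.Value }
    }
)

$emails
```

Because the regex only ever sees the text of a single node, tag and attribute names such as InternetMail_P3 cannot be picked up as part of a match.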

answered Nov 15 '22 06:11

Akim

When it comes to email validation, I usually choose the short version of RFC 2822:

[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?

You can find more info about email validation here
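
As a rough sketch of how this pattern could be used in PowerShell: a single-quoted string avoids the backtick and dollar-sign handling of double-quoted strings, with the apostrophe doubled to escape it. The -match operator is case-insensitive by default, so the lowercase-only classes still match mixed-case addresses. The variable names and sample text are taken from the question:

```powershell
$txt = '<p class=FillText><a name="InternetMail_P3"></a>First.Last@company-name.com</p>'

# Short RFC 2822 pattern; '' is an escaped apostrophe inside a
# single-quoted PowerShell string, and the backtick stays literal.
$re = '[a-z0-9!#$%&''*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&''*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?'

if ($txt -match $re)
{
    # There is no capture group here, so the whole match is in $matches[0].
    $email1 = $matches[0]
    Write-Host "$email1"
}
```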

answered Nov 15 '22 07:11

Pierluc SS
