
Greater than and less than symbol in regular expressions

Tags: regex, php

I am new to regular expressions, and I am worn out from studying all of the regex characters. I need to know the purpose of the greater-than symbol in a regex, for example:

preg_match('/(?<=<).*?(?=>)/', 'sadfas<[email protected]>', $email);

Please tell me the use of the greater-than and less-than symbols in regex.

badu asked Jan 11 '14 14:01




2 Answers

The greater-than symbol simply matches a literal >, here the one at the end of your target string.

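The less-than symbol is not so simple, because it appears twice: once as part of the lookbehind syntax (?<= and once as a literal <. First let's review the lookaround syntax:

The pattern (?<={pattern}) is a positive lookbehind assertion; it tests whether the text being matched is immediately preceded by a string matching {pattern}, without making that string part of the match.

The pattern (?={pattern}) is a positive lookahead assertion; it tests whether the text being matched is immediately followed by a string matching {pattern}, without making that string part of the match.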

So, breaking down your expression:

  • (?<=<) asserts that the match is immediately preceded by a literal <
  • .*? matches any character (except a newline) zero or more times, lazily
  • (?=>) asserts that the match is immediately followed by a literal >

Putting it all together, the pattern extracts [email protected] from the input string you have given it.
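As a minimal sketch of the difference the lookarounds make, the snippet below runs the pattern from the question next to a plain /<.*?>/ pattern. The address user@example.com is a placeholder chosen for illustration, not the address from the question, and the variable names are assumptions as well.

```php
<?php
// Hypothetical input: the address is a placeholder for illustration only.
$input = 'sadfas<user@example.com>';

// With lookarounds: < and > must surround the text,
// but they are not included in the match itself.
preg_match('/(?<=<).*?(?=>)/', $input, $email);
echo $email[0], PHP_EOL; // user@example.com

// Without lookarounds: < and > are ordinary literals,
// so they become part of the matched text.
preg_match('/<.*?>/', $input, $withBrackets);
echo $withBrackets[0], PHP_EOL; // <user@example.com>
```

In both patterns the characters < and > have no special meaning on their own; only the sequences (?<= and (?= are syntax.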

Boris the Spider answered Nov 14 '22 20:11


Your regex uses lookarounds to match the email address between the < and > characters. In your example input it matches [email protected].

Explanation:

(?<=<) Positive Lookbehind - Assert that the regex below can be matched
    < matches the character < literally
.*? matches any character (except newline)
    Quantifier: Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
(?=>) Positive Lookahead - Assert that the regex below can be matched
    > matches the character > literally
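To see why the lazy quantifier matters, here is a small sketch that compares .*? with the greedy .* on an input containing two bracketed addresses. The input string and variable names are made up for illustration.

```php
<?php
// Hypothetical input with two bracketed placeholder addresses.
$input = '<a@example.com> and <b@example.com>';

// Lazy: expands one character at a time and stops at the
// first position that is followed by >.
preg_match('/(?<=<).*?(?=>)/', $input, $lazy);
echo $lazy[0], PHP_EOL; // a@example.com

// Greedy: consumes as much as possible, then backtracks
// to the last position that is followed by >.
preg_match('/(?<=<).*(?=>)/', $input, $greedy);
echo $greedy[0], PHP_EOL; // a@example.com> and <b@example.com
```

With only one pair of brackets in the input, as in the question, both forms would return the same text.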

Online Demo: http://regex101.com/r/yH6tY8

anubhava answered Nov 14 '22 21:11