
regular expressions: how to find the bit between the "<>"

Tags: regex, php, email

In the following string,

Jason <[email protected]>

how can I extract the part inside the angle brackets?

I tried <\w> and it didn't work.

Ideas?

I'm using preg_match() in PHP if that makes a difference.

asked Dec 07 '10 04:12 by Jason


3 Answers

user502515 has already given the regex you want.

I'd like to explain why your regex <\w> did not work:

\w is shorthand for the character class [a-zA-Z0-9_] and matches exactly one character from that class. To match more characters, you need a quantifier:

  • + for one or more and
  • * for zero or more

Since you want to extract the matched text, enclose that part of the pattern in parentheses (...) so that it gets captured.

Your original task was to extract the string between < and >. The regex <(\w+)> will not do the job, because the character class \w does not include @ (or .).
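The points so far can be checked with a short sketch. It uses Python's re module rather than PHP's preg_match(), on the assumption that these particular constructs behave the same in both. The address jason@example.com and the string jason_01 are hypothetical stand-ins (the address in the question is not visible), and re.ASCII is passed so that \w means exactly [a-zA-Z0-9_].

```python
import re

# Hypothetical stand-in for the address in the question.
text = "Jason <jason@example.com>"

# <\w> needs exactly one word character between the brackets: no match.
one_char = re.search(r"<\w>", text, re.ASCII)

# <(\w+)> allows many word characters, but "@" and "." are not in \w,
# so the closing ">" is never reached and there is still no match.
word_run = re.search(r"<(\w+)>", text, re.ASCII)

# With input made only of word characters, group 1 holds the captured text.
captured = re.search(r"<(\w+)>", "Jason <jason_01>", re.ASCII)

print(one_char)           # None
print(word_run)           # None
print(captured.group(1))  # jason_01
```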

To match anything, use .*, which matches any string of zero or more characters other than newlines.

So the regex <(.*)> matches and captures any string between the angle brackets.

The match is greedy, so if the input string is foo<[email protected]>, bar<bar.com>, you'll extract [email protected]>, bar<bar.com. To fix this, make the match non-greedy by adding a ? after .*, which gives the correct regex <(.*?)>.
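A minimal sketch of the greedy versus non-greedy difference, again in Python's re module on the assumption that .* and .*? behave the same way as in PHP. The input uses foo@example.com as a hypothetical stand-in for the address that is not visible above.

```python
import re

# Hypothetical stand-in addresses; the originals are not shown in the post.
text = "foo<foo@example.com>, bar<bar.com>"

# Greedy: .* runs to the LAST ">" in the string.
greedy = re.search(r"<(.*)>", text).group(1)

# Non-greedy: .*? stops at the first ">" that lets the match succeed.
lazy = re.search(r"<(.*?)>", text).group(1)

# findall returns group 1 of every non-overlapping match.
all_lazy = re.findall(r"<(.*?)>", text)

print(greedy)    # foo@example.com>, bar<bar.com
print(lazy)      # foo@example.com
print(all_lazy)  # ['foo@example.com', 'bar.com']
```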

answered Sep 29 '22 10:09 by codaddict


Use <(.*?)> as regex, then.

answered Sep 29 '22 08:09 by user502515


To match from a < char to the nearest following >, with no < or > in between (note that <.*?> also matches strings like <..<...>), you can use

<([^<>]*)>

Regex details:

  • < - a < char
  • ([^<>]*) - Group 1: any zero or more chars other than < and >
  • > - a > char.
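To see how this differs from the lazy pattern, here is a small sketch in Python's re module. The input string is a made-up example containing a stray < before a real <...> pair.

```python
import re

# Hypothetical input containing a stray "<" before a real <...> pair.
text = "a <b <c> d"

# Lazy dot: starts at the first "<" and may cross the second "<".
lazy = re.findall(r"<(.*?)>", text)

# Negated class: cannot cross "<" or ">", so only the innermost pair matches.
strict = re.findall(r"<([^<>]*)>", text)

print(lazy)    # ['b <c']
print(strict)  # ['c']
```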

Code examples

  • c# - var res = Regex.Matches(text, @"<([^<>]*)>").Cast<Match>().Select(x => x.Groups[1].Value).ToList();
  • javascript - const matches = Array.from(text.matchAll(/<([^<>]*)>/g), x => x[1]);
  • php - $res = preg_match_all('~<([^<>]*)>~', $text, $matches) ? $matches[1] : [];
  • python - res = re.findall(r'<([^<>]*)>', text)

answered Sep 29 '22 10:09 by Wiktor Stribiżew