
Regex to match tags like <A>, <BB>, <CCC> but not <ABC>

Tags:

regex

I need a regex to match tags that look like <A>, <BB>, <CCC>, but not <ABC>, <aaa>, or <>. The tag must consist of the same uppercase letter, repeated. I've tried <[A-Z]+>, but that doesn't work. Of course I could write something like <(A+|B+|C+|...)>, but I wonder if there's a more elegant solution.

regex4html asked Jun 24 '10 13:06



1 Answer

You can use something like this (see this on rubular.com):

<([A-Z])\1*>

This uses a capturing group and a backreference. Basically:

  • You use (pattern) to "capture" a match
  • You can then use \n in your pattern, where n is the group number, to "refer back" to what that group matched

So in this case:

  • Group 1 captures ([A-Z]), a single uppercase letter immediately following <
  • Then \1* matches zero or more repetitions of that same letter, and > must follow immediately
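
As a quick check of the pattern above, here is a minimal sketch in Python, whose re module supports the same \1 backreference syntax. The names TAG and is_repeated_letter_tag are made up for illustration and are not part of the original answer.

```python
import re

# Same pattern as above: one uppercase letter captured as group 1,
# followed by zero or more repeats of that exact letter.
TAG = re.compile(r"<([A-Z])\1*>")

def is_repeated_letter_tag(text: str) -> bool:
    """Return True if the whole string is a tag like <A>, <BB>, <CCC>."""
    return TAG.fullmatch(text) is not None

# The examples from the question.
samples = ["<A>", "<BB>", "<CCC>", "<ABC>", "<aaa>", "<>"]
results = {s: is_repeated_letter_tag(s) for s in samples}

# Searching inside longer text works the same way.
found = [m.group(0) for m in TAG.finditer("x <A> y <ABC> z <BB>")]

print(results)
print(found)
```

Note that <ABC> is rejected because after group 1 captures A, \1* can only consume more A characters, so the > fails to match at B.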

References

  • regular-expressions.info/Grouping and Backreference
polygenelubricants answered Sep 22 '22 15:09