Regex: Use start of line/end of line signs (^ or $) in different context

Tags: regex

While working on a small regex task I ran into this problem. I have a string containing a list of tags that looks, e.g., like this:
foo,bar,qux,garp,wobble,thud

What I needed to do was check whether a certain tag, e.g. 'garp', was in this list. (What it finally matches is not really important, only whether there is a match or not.)

My first, somewhat naive attempt was the following regex:
[^,]garp[,$]

My idea was that before 'garp' there should either be the start of the line/string or a comma, after 'garp' there should be either a comma or the end of the line/string.

Now, it is instantly obvious that this regex is wrong: both ^ and $ change their behaviour inside a character class [ ]. A leading ^ negates the class, and $ is just a literal dollar sign.
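
To make the failure concrete, here is a minimal sketch using Python's re module as a stand-in (the question itself is about Oracle, so treat this purely as an illustration; the variable names are made up for the example). It shows that [^,] demands one non-comma character before 'garp' and that [,$] accepts a literal dollar sign rather than the end of the string.

```python
import re

tags = "foo,bar,qux,garp,wobble,thud"
naive = re.compile(r"[^,]garp[,$]")

# [^,] is a negated class: it needs one character that is NOT a comma.
# [,$] matches a literal ',' or a literal '$', never the end of the string.
in_list = naive.search(tags) is not None               # 'garp' is preceded by ','
only_tag = naive.search("garp") is not None             # nothing before or after 'garp'
false_hit = naive.search("biggarp,thud") is not None    # 'g' precedes 'garp'
literal_dollar = naive.search("xgarp$") is not None     # '$' taken literally

print(in_list, only_tag, false_hit, literal_dollar)
```

So the pattern misses the real tag and accepts strings it should reject.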

What I finally came up with is the following:
^garp$|^garp,|,garp,|,garp$

This regex just handles the four cases one by one: the tag as the only element of the list, at the beginning, in the middle, or at the end. It looks a bit ugly to me, and just for fun's sake I'd like to make it more elegant.
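
As a quick sanity check of the four-case pattern, the following sketch runs it against one sample per case plus two near misses. It again uses Python's re module only for illustration, and the sample strings are invented for the example.

```python
import re

four_cases = re.compile(r"^garp$|^garp,|,garp,|,garp$")

# Expected outcome for each sample list (True = 'garp' is a whole tag).
samples = {
    "garp": True,              # only element
    "garp,foo": True,          # at the beginning
    "foo,garp,bar": True,      # in the middle
    "foo,garp": True,          # at the end
    "foo,biggarp,bar": False,  # 'garp' is only part of another tag
    "foo,garple": False,       # 'garp' is only a prefix of another tag
}

results = {s: four_cases.search(s) is not None for s in samples}
print(results == samples)
```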

Is there a way to use the start-of-line/end-of-line anchors (^ and $) in the context of character classes?

EDIT: OK, some more info was requested, so here it is: I'm using this within an Oracle SQL statement. This sadly does not allow any look-around assertions, but as I'm only interested in whether there is a match (and not in what is matched), this does not really affect me here. The tags can contain non-alphabetical characters like - or _, so \bgarp\b would not work. Also, one tag can contain another tag, as SilentGhost said, so /garp/ doesn't work either.
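
The two constraints above can be illustrated with a small sketch. It uses Python's re module rather than Oracle, and the tag list is made up so that 'garp' appears only inside other tags, never as a tag of its own.

```python
import re

# 'garp' itself is NOT a tag here; it only occurs inside other tags.
tags = "foo,garp-old,big_garp,bar"

plain = re.search(r"garp", tags) is not None              # substring search
word_boundary = re.search(r"\bgarp\b", tags) is not None   # '-' counts as a boundary
four_cases = re.search(r"^garp$|^garp,|,garp,|,garp$", tags) is not None

print(plain, word_boundary, four_cases)
```

Both the plain search and the \b version report a match because of 'garp-old', while the comma-anchored pattern correctly reports none.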

asked Mar 31 '10 11:03 by fgysin



2 Answers

You can't use ^ and $ as anchors inside a character class: there, a leading ^ negates the class and $ is a literal dollar sign. But you can use an alternation to achieve the same effect:

(^|,)garp(,|$) 
answered Oct 08 '22 22:10 by Mark Byers
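
The alternation pattern (^|,)garp(,|$) can be generalised to any tag. The sketch below is one possible way to do that in Python; has_tag is a hypothetical helper name, and re.escape is added as an assumption so that tags containing regex metacharacters are treated literally. In Oracle, the same pattern string would presumably be passed to a function such as REGEXP_LIKE instead.

```python
import re

def has_tag(tag_list: str, tag: str) -> bool:
    """Hypothetical helper: True if `tag` is a whole element of a comma-separated list."""
    # A comma or the start of the string before the tag,
    # a comma or the end of the string after it.
    pattern = r"(^|,)" + re.escape(tag) + r"(,|$)"
    return re.search(pattern, tag_list) is not None

tags = "foo,bar,qux,garp,wobble,thud"
print(has_tag(tags, "garp"), has_tag("foo,biggarp", "garp"))
```

Escaping matters for tags such as 'a.b', where an unescaped dot would also match 'axb'.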


You just need to use a word boundary (\b) instead of ^ and $:

\bgarp\b 
answered Oct 09 '22 00:10 by SilentGhost