
regexp: match character group or end of line

Tags:

python

regex

How do you match ^ (beginning of line) and $ (end of line) inside a [] (character group)?


simple example

haystack string: zazty

rules:

  1. match any "z" or "y"
  2. if preceded by
    1. an "a", "b"; or
    2. at the beginning of the line.

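pass: match both "z"s (the first is at the beginning of the line, the second follows an "a"), but not the "y", which follows a "t"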

a regexp that would work is:
(?:^|[aAbB])([zZyY])
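
For reference, here is a minimal sketch of that working pattern run against the haystack in Python. The helper name matched_positions is only illustrative, not something from the question.

```python
import re

# (?:^|[aAbB]) accepts either the start of the string or one of a/A/b/B;
# the z/y that follows is captured in group 1.
PATTERN = re.compile(r"(?:^|[aAbB])([zZyY])")

def matched_positions(haystack):
    """Return (index, char) for every captured z/y in haystack."""
    return [(m.start(1), m.group(1)) for m in PATTERN.finditer(haystack)]

print(matched_positions("zazty"))  # [(0, 'z'), (2, 'z')]
```

Without re.MULTILINE, ^ only matches at the very start of the string, which is enough for a single-line haystack like this one.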

But I keep thinking it would be much cleaner if something inside the character group could mean beginning/end of line, like this:
[^aAbB]([zZyY])
(this example assumes the ^ means beginning of line, rather than what it really means there: negation of the character group)


note: I'm using Python, but knowing how to do this in bash and vim would be good too.

Update: I read the manual again. It says that inside a set of characters, everything loses its special meaning, except the character classes (e.g. \w).

Further down the list of character classes there's \A for beginning of line, but this does not work: [\AaAbB]([zZyY])

Any idea why?
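
A small sketch of what that attempt does in practice. The behaviour appears to depend on the Python version: as far as I know, recent Python 3 releases reject \A inside [] with re.error ("bad escape"), while older ones treat it as a literal "A". The helper name try_class_anchor is hypothetical. Either way, the "z" at the start of the line is never matched.

```python
import re
import warnings

def try_class_anchor(haystack):
    """Return start indexes of captured z/y, or None if the pattern is rejected."""
    try:
        with warnings.catch_warnings():
            # Some older versions only emit a warning for \A inside [...]
            warnings.simplefilter("ignore")
            pattern = re.compile(r"[\AaAbB]([zZyY])")
    except re.error:
        # Assumed behaviour on recent Python 3: \A is a bad escape inside [...]
        return None
    return [m.start(1) for m in pattern.finditer(haystack)]

# Prints None (pattern rejected) or [2] (only the z after "a"); never [0, 2].
print(try_class_anchor("zazty"))
```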

gcb asked Feb 06 '12 04:02



1 Answer

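You can't match a ^ or $ as an anchor within a [] because a character class always matches exactly one character, whereas an anchor matches a position and consumes nothing. The only characters with special meaning inside a character class are ^ (as the first character, meaning "everything but") and - (as in "range"), plus the shorthand character classes such as \w (in Python, ] and the backslash itself also stay special). \A and \Z are anchors, so they just don't count as character classes.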

This holds for all (standard) flavours of regex, so you're stuck with (^|[stuff]) and ($|[stuff]) (which aren't all that bad, really).
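
As a rough illustration of both forms in Python, the sketch below uses re.MULTILINE so that ^ and $ also match at line breaks. The "followed by" rule, the two-line sample text, and the helper name positions are assumptions made up for the example, not part of the question.

```python
import re

# z/y preceded by a/b or by the beginning of a line
START = re.compile(r"(?:^|[aAbB])([zZyY])", re.MULTILINE)
# z/y followed by a/b or by the end of a line (hypothetical mirror rule)
END = re.compile(r"([zZyY])(?:$|[aAbB])", re.MULTILINE)

def positions(pattern, text):
    """Return (index, char) for every z/y captured by pattern in text."""
    return [(m.start(1), m.group(1)) for m in pattern.finditer(text)]

text = "zazty\nyb"
print(positions(START, text))  # [(0, 'z'), (2, 'z'), (6, 'y')]
print(positions(END, text))    # [(0, 'z'), (4, 'y'), (6, 'y')]
```

The groups here are non-capturing, (?:...), so that group 1 is always the z/y itself.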

mathematical.coffee answered Oct 02 '22 12:10