Specifically, when does ^ mean "match start" and when does it mean "not the following" in regular expressions?
From the Wikipedia article and other references, I've concluded it means the former at the start and the latter when used with brackets, but how does the program handle the case where the caret is at the start and at a bracket? What does, say, ^[b-d]t$ match?
The caret character ^ anchors a regular expression to the beginning of the input, or (for multi-line regular expressions) to the beginning of a line. If it is preceded by a pattern that must match a non-empty sequence of (non-newline) input characters, then the entire regular expression cannot match anything.
^.*$ means: match, from beginning to end, any character that appears zero or more times. In other words, match everything from the start to the end of the string. This pattern is not very useful on its own. Let's look at a pattern that may be a bit more useful.
These are called anchor characters: If a caret ( ^ ) is at the beginning of the entire regular expression, it matches the beginning of a line. If a dollar sign ( $ ) is at the end of the entire regular expression, it matches the end of a line.
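The anchoring behavior described above can be checked with a short sketch. It assumes Python's re module, where the re.MULTILINE flag switches ^ and $ from string anchors to line anchors; other regex flavors expose the same setting under different names. The sample text is made up for illustration.

```python
import re

# Hypothetical two-line input, chosen only for illustration.
text = "bt\nct"

# Default mode: ^ and $ anchor to the whole string, so nothing matches.
single_line = re.findall(r"^[b-d]t$", text)

# MULTILINE mode: ^ and $ anchor to each line.
multi_line = re.findall(r"^[b-d]t$", text, re.MULTILINE)

# A caret preceded by a pattern that must consume a non-newline
# character can never match, because no line starts after "a".
impossible = re.search(r"a^b", "ab")

print(single_line)  # []
print(multi_line)   # ['bt', 'ct']
print(impossible)   # None
```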
In POSIX ERE and other regex flavors, outside a character class ( [...] ), + acts as a quantifier meaning "one or more, but as many as possible, occurrences of the quantified pattern".
^ only means "not the following" when inside and at the start of [], so [^...].
When it's inside [] but not at the start, it means the actual ^ character.
When it's escaped (\^), it also means the actual ^ character.
In all other cases it means start of the string / line (which one is language / setting dependent).
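Each of the four cases can be tried directly. The following is a minimal sketch using Python's re module; the input strings are invented for the example, and other regex engines may differ in details such as how the anchor treats line breaks.

```python
import re

# 1. Inside [] and first: negation ("any character except a, b or c").
negated = re.findall(r"[^abc]", "abcd^")

# 2. Inside [] but not first: a literal caret, one of the listed characters.
literal_in_class = re.findall(r"[ab^cd]", "a^e")

# 3. Escaped: a literal caret.
escaped = re.findall(r"\^", "2^10")

# 4. Anywhere else: an anchor, so only the first "a" is matched.
anchored = re.findall(r"^a", "aa")

print(negated)           # ['d', '^']
print(literal_in_class)  # ['a', '^']
print(escaped)           # ['^']
print(anchored)          # ['a']
```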
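So in short:
[^abc] -> not a, b or c
[ab^cd] -> a, b, ^ (character), c or d
\^ -> a ^ character
So ^[b-d]t$ means: start of the string / line, then one character from b to d, then t, then end of the string / line. It matches bt, ct and dt.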
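A quick way to confirm how ^[b-d]t$ is interpreted is to run it against a few candidate strings. This sketch assumes Python's re module, and the candidate list is made up for illustration.

```python
import re

pattern = re.compile(r"^[b-d]t$")

# Hypothetical test strings; only two-character strings made of
# b, c or d followed by t should pass.
candidates = ["bt", "ct", "dt", "at", "^t", "bct", "bt!"]

matches = [s for s in candidates if pattern.search(s)]
print(matches)  # ['bt', 'ct', 'dt']
```

Note that "^t" is rejected: the caret in the pattern sits outside the brackets, so it is an anchor and not a character to be matched.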