
Carets in Regular Expressions

Tags: regex, caret

Specifically, when does ^ mean "match start" and when does it mean "not the following" in regular expressions?

From the Wikipedia article and other references, I've concluded that it means the former at the start of a pattern and the latter when used inside brackets, but how does the program handle a caret that is both at the start of the pattern and directly in front of a bracket? What does, say, ^[b-d]t$ match?

Sylvester V Lowell, asked Jun 05 '13 15:06



1 Answer

^ means "not the following" only when it is the first character inside [], as in [^...].

When it's inside [] but not at the start, it is a literal ^ character.

When it's escaped (\^), it is also a literal ^ character.

In all other cases it means the start of the string or line (which of the two depends on the language and its settings).

So in short:

  • [^abc] -> not a, b or c
  • [ab^cd] -> a, b, ^ (character), c or d
  • \^ -> a ^ character
  • Anywhere else -> start of string / line.
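
As a rough illustration of the four cases above, here is a minimal sketch using Python's re module. The choice of Python and all sample strings are assumptions made for the example; other regex flavors can differ in detail, especially in how the string-versus-line behavior is switched on.

```python
import re

# [^abc]: caret first inside the brackets negates the class,
# so this matches any single character other than a, b, or c.
negated = re.findall(r"[^abc]", "abcd^")

# [ab^cd]: caret inside the brackets but not first is a literal ^.
literal_in_class = re.findall(r"[ab^cd]", "x^ay")

# \^: an escaped caret is a literal ^.
escaped = re.findall(r"\^", "2^3^4")

# ^ anywhere else is an anchor. In Python it means start of string
# by default and start of each line when re.MULTILINE is set.
text = "cat\ncar"
start_of_string = re.findall(r"^ca.", text)
start_of_line = re.findall(r"^ca.", text, flags=re.MULTILINE)

print(negated)           # ['d', '^']
print(literal_in_class)  # ['^', 'a']
print(escaped)           # ['^', '^']
print(start_of_string)   # ['cat']
print(start_of_line)     # ['cat', 'car']
```

The last two results show the "language / setting dependent" part: the same pattern anchors to the string or to each line depending on a flag.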

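So ^[b-d]t$ means:

  • Start of line
  • One character that is b, c or d
  • t character
  • End of line

The caret here sits outside the brackets, so it is the anchor, not a negation. The pattern therefore matches only a line consisting of exactly bt, ct or dt.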
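
To check this reading of ^[b-d]t$, here is a small sketch, again assuming Python's re module. The list of candidate strings is made up for the example.

```python
import re

# Caret outside the brackets: start anchor, then one of b/c/d, then t, then end.
pattern = re.compile(r"^[b-d]t$")

# Hypothetical test inputs, chosen to probe each part of the pattern.
candidates = ["bt", "ct", "dt", "at", "et", "bta", "abt", "b", "t", "^t"]

matches = [s for s in candidates if pattern.search(s)]
print(matches)  # ['bt', 'ct', 'dt']
```

Note that "at" is rejected rather than accepted, which confirms that the leading ^ is not negating the class. In Python specifically, $ also matches just before a single trailing newline, so "bt\n" would match as well; other languages may treat that case differently.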

Bernhard Barker, answered Oct 12 '22 12:10