
"Match a literal character", or "match a character literally"?

Tags:

regex

I was building a regex with the regex101 tool and read this in the explanation field:

[.] - the literal character .

[\.] - matches the character . literally

I get lost between "literal character" and "character literally". What is the difference between these two?

Rikard asked Jun 20 '14 06:06




1 Answer

There is no difference. Sorry, I take that back. The only difference is the wording that Firas Dib, the author of regex101, chose to explain the various tokens.

A literal character or matching something literally refers to specifying an actual character in the text: for instance, a to match a, as opposed to a character class such as \w that could also match a.
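To make the distinction concrete, here is a minimal sketch using Python's built-in re module (the question is not tied to any language, so Python is just an assumption for illustration). The names literal_a and word_class are hypothetical.

```python
import re

# A literal: matches only the character "a" itself.
literal_a = re.compile(r"a")

# A shorthand character class: matches any word character, "a" included.
word_class = re.compile(r"\w")

for text in ("a", "b"):
    print(
        text,
        bool(literal_a.fullmatch(text)),   # True only for "a"
        bool(word_class.fullmatch(text)),  # True for both "a" and "b"
    )
```

Both patterns accept a, but only the class also accepts b.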

You can match a literal period in any of these three ways:

  1. \.
  2. [.]
  3. [\.]
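If you want to check that the three forms behave the same, a quick sketch like this should do it (Python's re module is assumed here; find_positions is a hypothetical helper, and other regex flavors may differ in edge cases).

```python
import re

def find_positions(pattern, text):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

patterns = [r"\.", r"[.]", r"[\.]"]
text = "a.b"

for p in patterns:
    # Each of the three forms finds only the period at index 1.
    print(p, find_positions(p, text))

# An unescaped dot outside a class is the catch-all and matches every character.
print(".", find_positions(r".", text))
```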

Which Option is Better?

  • Some people like option 2 because it makes it clear you are matching a period, not the catch-all dot. It stands out. For myself, I use \.. Some people will say that using a character class is less optimal, but on modern processors it makes no difference. You pick.
  • Option 3 is over the top and is typically used when someone doesn't know that periods don't need to be escaped inside a character class. In my view it's confusing. What did the author mean? Were they trying to create a character class to match either a backslash or a period, and made a typo? (That would be [\\.].)
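The difference between [\.] and [\\.] is easy to see on a string that contains both a backslash and a period. This is only a sketch in Python's re module, with a made-up sample string:

```python
import re

# Hypothetical sample: the five characters  a \ b . c
sample = "a\\b.c"

# [\.] is a class holding just an (unnecessarily) escaped period.
only_dot = re.findall(r"[\.]", sample)

# [\\.] is a class holding an escaped backslash and a period.
slash_or_dot = re.findall(r"[\\.]", sample)

print(only_dot)      # the period only
print(slash_or_dot)  # the backslash and the period
```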
zx81 answered Oct 20 '22 21:10