
Regex escape with \ or \\?

Tags: c#, regex

Can someone explain when a double backslash, rather than a single backslash, is needed to escape a character in a regular expression?

A lot of references online use a single backslash and online regex testers work with single backslashes, but in practice I often have to use a double backslash to escape a character.

For example:

"SomeString\."

This works in an online regex tester and matches "SomeString" followed by a dot.

However, in practice I have to use a double escape:

if (Regex.IsMatch(myString, "SomeString\\."))
Duane asked Sep 11 '14 09:09


2 Answers

C# does not have a special syntax for construction of regular expressions, like Perl, Ruby or JavaScript do. It instead uses a constructor that takes a string. However, strings have their own escaping mechanism, because you want to be able to put quotes inside the string. Thus, there are two levels of escaping.

So, in a regular expression, w means the letter "w", while \w means a word character. However, in a regular C# string literal the backslash starts a string escape sequence, and \w is not one that C# recognizes, so "\w" does not compile (error CS1009, unrecognized escape sequence). In languages that silently drop an unknown escape, such as JavaScript, "\w" is the same string as "w", and you end up matching the letter "w" instead of any word character. Thus, to pass the backslash to the regexp, you need to put two backslashes in the string literal ("\\w"): one is removed when the string literal is interpreted, and the other is used by the regular expression.
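The two levels of escaping can be checked directly. This is a minimal sketch, assuming C# 9 or later (for top-level statements); the variable names are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// In a regular string literal, "\\w" holds two characters: a backslash and 'w'.
string wordCharPattern = "\\w";
// A bare "w" holds one character, so the regex engine sees the letter w.
string letterPattern = "w";

Console.WriteLine(wordCharPattern.Length);                 // 2
Console.WriteLine(Regex.IsMatch("a", wordCharPattern));    // True: 'a' is a word character
Console.WriteLine(Regex.IsMatch("a", letterPattern));      // False: there is no 'w' in "a"
Console.WriteLine(Regex.IsMatch("w", letterPattern));      // True
```

The string literal consumes one backslash, and the regex engine receives the remaining \w.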

When working with regular expressions directly (such as in most online regexp testers, or when using verbatim strings @"..."), you don't have to worry about the interpretation of string literals, and you always write just one backslash. The exception is matching a backslash itself, which takes two (\\), but then you're escaping the backslash for the regexp, not for the string.
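A short sketch of the same point, again assuming C# 9 or later; the sample inputs such as "SomeStringX" and the path C:\temp are made up for illustration. It compares a regular literal with a verbatim one and shows the backslash-matching case.

```csharp
using System;
using System.Text.RegularExpressions;

// Both literals produce the same pattern text: SomeString\.
string regularLiteral = "SomeString\\.";
string verbatimLiteral = @"SomeString\.";

Console.WriteLine(regularLiteral == verbatimLiteral);                 // True
Console.WriteLine(Regex.IsMatch("SomeString.", verbatimLiteral));     // True
Console.WriteLine(Regex.IsMatch("SomeStringX", verbatimLiteral));     // False: the dot is literal

// Matching one literal backslash: the regex needs \\,
// so a regular string literal needs four backslashes.
string backslashRegular = "\\\\";
string backslashVerbatim = @"\\";

Console.WriteLine(backslashRegular == backslashVerbatim);             // True
Console.WriteLine(Regex.IsMatch(@"C:\temp", backslashVerbatim));      // True
```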

Amadan answered Sep 28 '22 06:09


\ is also the escape character for string literals in C#, so in "SomeString\\." the first \ escapes the second \ in the string literal. The single \ that reaches the method then escapes the . in the regex.

To avoid double escaping, use a verbatim string:

if (Regex.IsMatch(myString, @"SomeString\."))

Ben Robinson answered Sep 28 '22 04:09