Difference between using '\s' and an actual space in regex? [duplicate]

Tags:

regex

I have the string id=0515 abcdefghijk. With the goal of matching only the 0515, I created the following regex: ((?<=id=).*(?<=\s)).

My result was "0515 " (with the trailing space between the id and the letters included).

If I change my regex to the following (replace the '\s' with an actual space), I get my intended result of just the numbers with no space at the end: ((?<=id=).*(?= ))

Is it okay to use an actual space instead of the character code? Or does my regex need more work?
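The two patterns above can be reproduced with a short sketch. Python's `re` module is an assumption here, since the question does not name a regex flavor, and the variable names are purely illustrative. Note that the second pattern changes more than the space: `(?<=\s)` is a lookbehind and `(?= )` is a lookahead, so a third variant with `(?=\s)` is included for comparison.

```python
import re

text = "id=0515 abcdefghijk"

# First pattern: the trailing (?<=\s) is a lookbehind, so the match has to
# END right after a whitespace character. The space is part of the match.
with_lookbehind = re.search(r"((?<=id=).*(?<=\s))", text).group()

# Second pattern: (?= ) is a lookahead, so the match has to end right
# BEFORE a space. The space is not consumed.
with_lookahead_space = re.search(r"((?<=id=).*(?= ))", text).group()

# Same lookahead, but written with \s instead of a literal space.
with_lookahead_s = re.search(r"((?<=id=).*(?=\s))", text).group()

print(repr(with_lookbehind))       # '0515 '
print(repr(with_lookahead_space))  # '0515'
print(repr(with_lookahead_s))      # '0515'
```

On this input, the literal space and `\s` give the same result once both are used in a lookahead, which suggests the trailing space came from the lookbehind rather than from `\s` itself.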

Kervvv asked Dec 11 '17 23:12





1 Answer

The difference is that a literal space matches only the space character, while \s will match any whitespace character (a space, \r, \n, \t, \f and \v).
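A minimal sketch of that difference, assuming Python's `re` module (the sample string is made up for illustration):

```python
import re

# Hypothetical sample containing a space, tab, newline, carriage return,
# form feed and vertical tab between single letters.
sample = "a b\tc\nd\re\ff\vg"

# A literal space in the pattern matches only the space character.
spaces_only = re.findall(" ", sample)

# \s matches every whitespace character in the sample.
any_whitespace = re.findall(r"\s", sample)

print(spaces_only)     # [' ']
print(any_whitespace)  # [' ', '\t', '\n', '\r', '\x0c', '\x0b']
```

Exactly which characters `\s` covers depends on the regex flavor, so the second list may differ in other engines.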

While there's nothing wrong with using a space in a regex, considering you're only looking for the digits, why not simply use \d (which matches any digit, 0 to 9)?

This will cut down your regex significantly and achieve the same result.
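The exact pattern behind that suggestion is not shown, so the following is only a sketch of one way to apply \d, assuming Python's `re` module. Both patterns are illustrative guesses rather than the answerer's own regex.

```python
import re

text = "id=0515 abcdefghijk"

# Option 1: keep the lookbehind from the question, but match digits only.
# \d+ stops at the first non-digit, so no trailing lookaround is needed.
match = re.search(r"(?<=id=)\d+", text)
digits = match.group() if match else None

# Option 2: match the "id=" prefix normally and capture the digits in a group.
captured = re.search(r"id=(\d+)", text)
digits_from_group = captured.group(1) if captured else None

print(digits)             # 0515
print(digits_from_group)  # 0515
```

Because \d+ cannot run past the digits, neither a literal space nor \s has to appear in the pattern at all.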

Hope this helps! :)

Obsidian Age answered Oct 23 '22 20:10
