What's the difference between these regexes

I'm reading Ionic's source code. I came across this regex, and I'm pretty baffled by it.

([\s\S]+?)

OK, so it captures every character that is either whitespace or non-whitespace?

Why didn't they just do

(.+?)

Am I missing something?

user133688 asked Sep 11 '15 22:09



2 Answers

The . matches any character except a newline. To make it match a newline, most regex flavors provide a modifier (dotall, singleline). JavaScript originally had no such modifier; ES2018 added the s (dotAll) flag, but older engines and older code predate it.

Thus, a work-around is to use a [\s\S] character class that will match any character, including a newline, because \s will match all whitespace and \S will match all non-whitespace characters. Similarly, one could use [\d\D] or [\w\W].
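A minimal sketch of the difference, assuming a hypothetical two-line string wrapped in made-up <b> tags (this is not Ionic's actual pattern or input):

```javascript
// Hypothetical sample text: the content between the tags spans two lines.
const text = "<b>first\nsecond</b>";

// `.` cannot cross the newline, so the closing tag is never reached.
const dotMatch = text.match(/<b>(.+?)<\/b>/);

// [\s\S] accepts every character, newline included.
const classMatch = text.match(/<b>([\s\S]+?)<\/b>/);

// The equivalent complementary pairs behave the same way.
const alternatives = [/<b>([\d\D]+?)<\/b>/, /<b>([\w\W]+?)<\/b>/];
const allAgree = alternatives.every((re) => re.exec(text)[1] === classMatch[1]);

console.log(dotMatch);                      // null
console.log(JSON.stringify(classMatch[1])); // "first\nsecond"
console.log(allAgree);                      // true
```

Any pair of a shorthand class and its negation covers the full character set, which is why all three spellings capture the same text.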

Also, there is a [^] pattern to match the same thing in JS, but since it is JavaScript-specific, the regexes containing this pattern are not portable between regex flavors.

The +? lazy quantifier matches one or more characters conforming to the preceding subpattern, but as few as possible. Thus, when used like this at the end of the pattern, it will match just one character.
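To illustrate the lazy behavior, here is a small sketch on a hypothetical three-character string. It compares the lazy group at the end of a pattern, the greedy version, and the lazy group followed by more pattern:

```javascript
// Hypothetical input, chosen only for illustration.
const sample = "abc";

// Lazy and last in the pattern: nothing forces it to expand, so it takes one character.
const lazyAtEnd = sample.match(/([\s\S]+?)/)[1];

// Greedy: takes as many characters as it can.
const greedy = sample.match(/([\s\S]+)/)[1];

// Lazy but followed by "c": it expands only until the rest of the pattern can match.
const lazyBeforeC = sample.match(/([\s\S]+?)c/)[1];

console.log(lazyAtEnd);   // a
console.log(greedy);      // abc
console.log(lazyBeforeC); // ab
```

In practice the lazy group is useful because of what follows it in the full pattern, such as a closing delimiter.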

Wiktor Stribiżew answered Oct 02 '22 21:10


In many regex implementations, "." doesn't match newlines, so they use "[\s\S]" as a little hack =)

Walking.In.The.Air answered Oct 02 '22 20:10