
Use of capture groups in String.split() [duplicate]


$ node
> "ababaabab".split(/a{2}/)
[ 'abab', 'bab' ]
> "ababaabab".split(/(a){2}/)
[ 'abab', 'a', 'bab' ]
>

So, this doesn't make sense to me. Can someone explain it? I don't get why the 'a' shows up.

Note: I am trying to match for doubled line endings (possibly on windows files) so I am splitting on /(\r?\n){2}/. However I get extraneous '\015\n' entries in my array (note \015 == \r).

Why are these showing up?

Note: this also happens in browser JS engines, so it is JavaScript behavior in general, not something specific to Node.

Steven Lu asked Jan 29 '14 00:01




1 Answer

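In your second result, the 'a' appears because you've wrapped it in a capture group (the parentheses). When the separator regex contains capturing parentheses, String.split() splices the captured text into the output array. For a repeated group like (a){2}, the capture holds whatever the last repetition matched, which here is a single 'a'.

If you don't want it included but still need the grouping (so that {2} applies to the whole thing), use a non-capturing group: (?:a). Putting ?: right after the opening parenthesis keeps the grouping but records no capture, so nothing extra ends up in the split result. For your line-endings case that would be /(?:\r?\n){2}/.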
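To make the difference concrete, here is a small sketch that runs the question's string through both forms and then applies the non-capturing version to doubled line endings. The sample `text` value and the variable names are made up for illustration; only the regexes come from the question.

```javascript
// Capturing group: the text captured by the last repetition of (a)
// is spliced into the result between the pieces.
const withCapture = "ababaabab".split(/(a){2}/);
console.log(withCapture); // [ 'abab', 'a', 'bab' ]

// Non-capturing group: same match, but nothing extra in the result.
const withoutCapture = "ababaabab".split(/(?:a){2}/);
console.log(withoutCapture); // [ 'abab', 'bab' ]

// Doubled line endings, tolerating Windows-style \r\n.
// `text` is a hypothetical sample, not data from the question.
const text = "first\r\n\r\nsecond\n\nthird";
const paragraphs = text.split(/(?:\r?\n){2}/);
console.log(paragraphs); // [ 'first', 'second', 'third' ]
```

With /(\r?\n){2}/ instead, the last call would also return the captured line endings as separate array entries, which is where the stray '\r\n' items come from.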

Here's a simple example of this in action: http://regex101.com/r/yM1vM4

brandonscript answered Sep 24 '22 00:09
