
JavaScript split string with .match(regex)

From the Mozilla Developer Network for function split():

The split() method returns the new array.

When found, separator is removed from the string and the substrings are returned in an array. If separator is not found or is omitted, the array contains one element consisting of the entire string. If separator is an empty string, str is converted to an array of characters.

If separator is a regular expression that contains capturing parentheses, then each time separator is matched, the results (including any undefined results) of the capturing parentheses are spliced into the output array. However, not all browsers support this capability.
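
As a quick illustration of the rules quoted above, here is a minimal sketch. The sample strings and variable names are made up for illustration and do not come from MDN:

```javascript
// Illustrative check of the split() rules quoted above.
const sample = 'a-b-c';

// Separator not found, or omitted: one element holding the entire string.
const notFound = sample.split('x'); // ["a-b-c"]
const omitted = sample.split();     // ["a-b-c"]

// Empty-string separator: the string becomes an array of characters.
const chars = 'abc'.split('');      // ["a", "b", "c"]

// Capturing parentheses: the captured text is spliced into the output.
const captured = sample.split(/(-)/); // ["a", "-", "b", "-", "c"]

// A group that did not take part in the match is spliced in as undefined.
const withUndefined = 'a1b'.split(/(\d)|(x)/); // ["a", "1", undefined, "b"]

console.log(notFound, omitted, chars, captured, withUndefined);
```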

Take the following example:

var string1 = 'one, two, three, four';
var splitString1 = string1.split(', ');
console.log(splitString1); // Outputs ["one", "two", "three", "four"]

This is a really clean approach. I tried the same with a regular expression and a somewhat different string:

var string2 = 'one split two split three split four';
var splitString2 = string2.split(/\ split\ /);
console.log(splitString2); // Outputs ["one", "two", "three", "four"]

This works just as well as the first example. In the following example, I have altered the string once more, with 3 different delimiters:

var string3 = 'one split two splat three splot four';
var splitString3 = string3.split(/\ split\ |\ splat\ |\ splot\ /);
console.log(splitString3); // Outputs ["one", "two", "three", "four"]

However, the regular expression is getting messy at this point. I can group the delimiters, but then the result includes them:

var string4 = 'one split two splat three splot four';
var splitString4 = string4.split(/\ (split|splat|splot)\ /);
console.log(splitString4); // Outputs ["one", "split", "two", "splat", "three", "splot", "four"]

So I tried removing the spaces from the regular expression while keeping the group, to no avail:

var string5 = 'one split two splat three splot four';
var splitString5 = string5.split(/(split|splat|splot)/);
console.log(splitString5); // Outputs ["one ", "split", " two ", "splat", " three ", "splot", " four"]

However, when I remove the parentheses from the regular expression, the delimiters no longer appear in the result, although the surrounding spaces remain:

var string6 = 'one split two splat three splot four';
var splitString6 = string6.split(/split|splat|splot/);
console.log(splitString6); // Outputs ["one ", " two ", " three ", " four"]

An alternative would be to use match() to filter out the delimiters, but I don't really understand how negative lookaheads work:

var string7 = 'one split two split three split four';
var splitString7 = string7.match(/((?!split).)*/g);
console.log(splitString7); // Outputs ["one ", "", "plit two ", "", "plit three ", "", "plit four", ""]

It doesn't match the whole word to begin with. And to be honest, I don't even know what's going on here exactly.


How do I properly split a string using regular expressions without having the delimiter in my result?

asked Jun 15 '16 14:06 by Audite Marlow



2 Answers

Use a non-capturing group, (?:...), in the split regex. Because the group does not capture, the matched delimiters are not included in the resulting array.

var string4 = 'one split two splat three splot four';
var splitString4 = string4.split(/\s+(?:split|splat|splot)\s+/);
console.log(splitString4);
// Output => ["one", "two", "three", "four"]
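
If the list of delimiter words is not fixed, the same non-capturing pattern can be built at runtime. The following is only a sketch: splitOnWords is a hypothetical helper name, and it assumes the delimiters are whole words surrounded by whitespace, as in the question.

```javascript
// Hypothetical helper: split `text` on any of the given delimiter words,
// together with the whitespace around them.
function splitOnWords(text, words) {
  // Escape regex metacharacters so each word is matched literally.
  const escaped = words.map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  // (?:...) groups the alternatives without capturing them,
  // so split() does not splice the delimiters into the result.
  const separator = new RegExp('\\s+(?:' + escaped.join('|') + ')\\s+');
  return text.split(separator);
}

const parts = splitOnWords(
  'one split two splat three splot four',
  ['split', 'splat', 'splot']
);
console.log(parts); // ["one", "two", "three", "four"]
```

Passing an empty words array is not handled here; a real implementation would probably want to return [text] in that case.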
answered Oct 02 '22 06:10 by anubhava


If you want to use match(), you can write it like this:

'one split two split three split four'.match(/(\b(?!split\b)[^ $]+\b)/g)
// Output => ["one", "two", "three", "four"]

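What does it do?

  • \b Matches a word boundary, here the start of a word.

  • (?!split\b) Negative lookahead: checks that the word starting here is not split.

  • [^ $]+ Matches one or more characters other than a space or a literal $. Inside a character class, $ is an ordinary character, not the end-of-string anchor. This part matches a word; the lookahead ensures that the word is not split.

  • \b Matches the word boundary at the end of the word.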
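
The same idea extends to several delimiter words. The sketch below also shows why the attempt in the question returned fragments such as "plit two ". The variable names are illustrative, and \S+ (one or more non-whitespace characters) is swapped in for [^ $]+ as an assumption about what counts as a word:

```javascript
const input = 'one split two splat three splot four';

// The lookahead lists every delimiter word; \b after the group makes sure
// only whole words are rejected.
const words = input.match(/\b(?!(?:split|splat|splot)\b)\S+/g);
console.log(words); // ["one", "two", "three", "four"]

// The pattern from the question tests the lookahead before every single
// character. At the "s" of "split" the lookahead fails, so the match ends
// (producing an empty match), the engine moves forward one character, and
// from "plit" onward the text no longer starts with "split".
const attempt = 'one split two'.match(/((?!split).)*/g);
console.log(attempt); // ["one ", "", "plit two", ""]
```

Note that match() with the g flag returns null rather than an empty array when nothing matches, so callers may want a fallback such as `input.match(...) || []`.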

answered Oct 02 '22 08:10 by nu11p01n73R