
Splitting a string into an array of n words

I'm trying to turn this:

"This is a test this is a test"

into this:

["This is a", "test this is", "a test"]

I tried this:

const sample = "This is a test this is a test"
const re = /\b[\w']+(?:[^\w\n]+[\w']+){0,2}\b/
const wordList = sample.split(re)
console.log(wordList)

But I got this:

[ '',
  ' ',
  ' ']

Why is this?

(The rule is to split the string every N words.)

alex asked Nov 26 '16 10:11



3 Answers

The String#split method treats whatever the regular expression matches as the separator, so the matched text is dropped and only the pieces between the matches end up in the result array.

Use the String#match method with a global flag (g) on your regular expression instead:

var sample="This is a test this is a test"

const re = /\b[\w']+(?:\s+[\w']+){0,2}/g;
const wordList = sample.match(re);
console.log(wordList);

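Regex explanation: [\w']+ matches one word, and (?:\s+[\w']+){0,2} matches up to two more words, each preceded by whitespace, so every match holds at most three words.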
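
The question asks for a split every N words, while the pattern above hardcodes three. As a minimal sketch, the same pattern can be built with the RegExp constructor so the group repeats up to N - 1 times. The helper name chunkWords is hypothetical and not part of the original answer; it assumes n is a positive integer. The last lines show, for contrast, what split returns with the same pattern.

```javascript
// Hypothetical helper: groups a string into chunks of at most n words.
// Assumes n is a positive integer.
function chunkWords(str, n) {
  // One word, then up to n - 1 further whitespace-separated words.
  const re = new RegExp("\\b[\\w']+(?:\\s+[\\w']+){0," + (n - 1) + "}", "g");
  // match() returns null when nothing matches, so fall back to an empty array.
  return str.match(re) || [];
}

const sample = "This is a test this is a test";
const byThree = chunkWords(sample, 3);
const byTwo = chunkWords(sample, 2);
console.log(byThree); // [ 'This is a', 'test this is', 'a test' ]
console.log(byTwo);   // [ 'This is', 'a test', 'this is', 'a test' ]

// For contrast: split() keeps only what lies between the matches.
const leftovers = sample.split(/\b[\w']+(?:\s+[\w']+){0,2}/);
console.log(leftovers); // [ '', ' ', ' ', '' ]
```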

Pranav C Balan answered Sep 18 '22 16:09


As an alternative approach, you can split the string on spaces and then merge the resulting words in batches.

function splitByWordCount(str, count) {
  var arr = str.split(' ')
  var r = [];
  while (arr.length) {
    r.push(arr.splice(0, count).join(' '))
  }
  return r;
}

var a = "This is a test this is a test";
console.log(splitByWordCount(a, 3))
console.log(splitByWordCount(a, 2))
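
One thing to be aware of: splitting on a single space yields empty strings when the input contains consecutive spaces, tabs, or leading/trailing whitespace. A possible variant, sketched here under the assumption that any run of whitespace should count as one separator, splits on /\s+/ instead. The name splitByWordCountLoose is hypothetical, and it uses slice rather than splice so the word array is not mutated.

```javascript
// Hypothetical variant of splitByWordCount that tolerates irregular whitespace.
function splitByWordCountLoose(str, count) {
  // Guard against a non-positive count, which would loop forever below.
  if (!Number.isInteger(count) || count < 1) {
    throw new RangeError("count must be a positive integer");
  }
  const trimmed = str.trim();
  if (trimmed === "") return [];
  // Any run of whitespace (spaces, tabs, newlines) separates two words.
  const words = trimmed.split(/\s+/);
  const chunks = [];
  for (let i = 0; i < words.length; i += count) {
    chunks.push(words.slice(i, i + count).join(" "));
  }
  return chunks;
}

const messy = "  This  is a\ttest this is a test ";
const looseChunks = splitByWordCountLoose(messy, 3);
console.log(looseChunks); // [ 'This is a', 'test this is', 'a test' ]
```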

Rajesh answered Sep 18 '22 16:09


Your regex is fine, just not with split. split treats whatever the pattern matches as a delimiter and removes it from the result. For instance:

var arr = "1, 1, 1, 1";
arr.split(','); // ['1', ' 1', ' 1', ' 1']
// but
arr.split(1);   // ['', ', ', ', ', ', ', '']

Instead, use match or exec, like this:

var x = "This is a test this is a test";
var re = /\b[\w']+(?:[^\w\n]+[\w']+){0,2}\b/g
var y = x.match(re);
console.log(y);
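
The snippet above uses match. For the exec route, a rough sketch with the same pattern could look like the following; the variable names are illustrative only. Unlike match with the g flag, exec also exposes the position of each chunk.

```javascript
const text = "This is a test this is a test";
const pattern = /\b[\w']+(?:[^\w\n]+[\w']+){0,2}\b/g;
const found = [];
let m;
// With the g flag, each exec() call resumes from pattern.lastIndex
// and returns null once there are no further matches.
while ((m = pattern.exec(text)) !== null) {
  found.push({ chunk: m[0], index: m.index });
}
console.log(found);
// [ { chunk: 'This is a', index: 0 },
//   { chunk: 'test this is', index: 10 },
//   { chunk: 'a test', index: 23 } ]
```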

RizkiDPrast answered Sep 20 '22 16:09