I have a string that I'd like to split into an array of items, as in the following example:
var text = "I like grumpy cats. Do you?"
// to result in:
var wordArray = ["I", " ", "like", " ", "grumpy", " ", "cats", ".", " ", "Do", " ", "you", "?" ]
I've tried the following expression (and a few similar variations) without success:
var wordArray = text.split(/(\S+|\W)/)
//this disregards spaces and doesn't separate punctuation from words
In Ruby there's a regex operator (\b) that splits at any word boundary, preserving spaces and punctuation, but I can't find an equivalent for JavaScript. I would appreciate your help.
The split() method splits a string into an array of substrings and returns the new array; it does not change the original string. If a single space (" ") is used as the separator, the string is split between words.
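The behaviour described above can be sketched as follows. This is only a minimal illustration; the variable names are made up for the example and do not come from the question.

```javascript
// Illustrative names only; not taken from the original question.
const sentence = "I like grumpy cats";

// Splitting on a single space returns a new array of words.
const words = sentence.split(" ");
console.log(words);    // ["I", "like", "grumpy", "cats"]

// The original string is left untouched.
console.log(sentence); // "I like grumpy cats"

// A separator that never occurs yields a one-element array
// containing the whole string.
const whole = sentence.split(",");
console.log(whole.length); // 1
```

Note that a plain " " separator discards the spaces, which is why it is not enough for the result the question asks for.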
Use the String#match method with the regex /\w+|\s+|[^\s\w]+/g.
\w+ - matches any word
\s+ - matches whitespace
[^\s\w]+ - matches any run of characters that are neither whitespace nor word characters
var text = "I like grumpy cats. Do you?";
console.log(
text.match(/\w+|\s+|[^\s\w]+/g)
)
FYI: if you only want to match a single special character at a time, you can use \W or . instead of [^\s\w]+.
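The difference between the grouped pattern and the single-character variant can be sketched like this. The helper names tokenizeGrouped and tokenizeSingle and the sample string are hypothetical, introduced only for illustration.

```javascript
// Hypothetical helper names, introduced only for this illustration.

// Runs of punctuation stay together: "?!" is one token.
function tokenizeGrouped(text) {
  // match() returns null when nothing matches, so fall back to [].
  return text.match(/\w+|\s+|[^\s\w]+/g) || [];
}

// \s+ is tried before \W, so whitespace is still grouped,
// but every punctuation character becomes its own token.
function tokenizeSingle(text) {
  return text.match(/\w+|\s+|\W/g) || [];
}

const sample = "Really?! Yes.";
console.log(tokenizeGrouped(sample)); // ["Really", "?!", " ", "Yes", "."]
console.log(tokenizeSingle(sample));  // ["Really", "?", "!", " ", "Yes", "."]
```

Which variant is preferable depends on whether something like "?!" or "..." should count as one item or several.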
The word boundary \b should work fine.
Example
"I like grumpy cats. Do you?".split(/\b/)
// ["I", " ", "like", " ", "grumpy", " ", "cats", ". ", "Do", " ", "you", "?"]
Edit
To handle the case of ".", where the full stop stays attached to the space that follows it, we can also split before [.\s].
Example
"I like grumpy cats. Do you?".split(/(?=[.\s]|\b)/)
// ["I", " ", "like", " ", "grumpy", " ", "cats", ".", " ", "Do", " ", "you", "?"]
(?=[.\s]) - positive lookahead; splits just before a . or a whitespace character (\s).
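As a rough check, the approaches discussed in this thread can be compared side by side. This is a sketch only; the variable names are illustrative, and the second input ("Hm..") is an extra example that does not come from the question.

```javascript
// Illustrative comparison; variable names are not from the original answers.
const text = "I like grumpy cats. Do you?";

const byPlainBoundary = text.split(/\b/);
const byLookahead = text.split(/(?=[.\s]|\b)/);
const byMatch = text.match(/\w+|\s+|[^\s\w]+/g);

// Plain \b leaves the full stop glued to the following space.
console.log(byPlainBoundary.includes(". ")); // true

// For this sentence the lookahead split and the match() regex agree.
console.log(JSON.stringify(byLookahead) === JSON.stringify(byMatch)); // true

// They can differ on repeated punctuation: the lookahead splits
// before every ".", while [^\s\w]+ keeps the run together.
const dotsBySplit = "Hm..".split(/(?=[.\s]|\b)/);
const dotsByMatch = "Hm..".match(/\w+|\s+|[^\s\w]+/g);
console.log(dotsBySplit); // ["Hm", ".", "."]
console.log(dotsByMatch); // ["Hm", ".."]
```

The lookahead version only names "." explicitly, so other punctuation is separated from words only where a word boundary or whitespace happens to fall.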