
JavaScript - List all whitespace indexes using regex

I want to collect the index of every whitespace character in a string into an array.

I tried this:

<script>
    var text1 = "Saya cinta bahasa java";
    var waw = text1.search(/\s/g);
    alert(waw);
</script>

This fails: it only shows the index of the first whitespace character. What I need is to list all of them in an array variable such as waw.

It should be:

waw[0]= 4
waw[1]= 10
waw[2]= 17
The Mr. Totardo asked Nov 30 '25 06:11


2 Answers

To get an array with the index of each whitespace character in a string, iterate over all matches of the \s pattern with exec and, for each match, use the regex's lastIndex property to compute where the match starts. There is no need to replace anything or to use a callback function.

var waw = [];
var re = /\s/g;
var text1 = "Saya cinta bahasa java";
var m;
while ((m = re.exec(text1)) !== null) {
  // lastIndex points just past the match, so subtract the matched text's length
  waw.push(re.lastIndex - m[0].length);
}
document.write(JSON.stringify(waw)); // => [4,10,17]
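As a minimal sketch, the same exec loop can be wrapped in a reusable function. The name whitespaceIndices is hypothetical and not part of the original answer; this version reads the start position from the match's index property instead of computing it from lastIndex.

```javascript
// Hypothetical helper (name chosen for illustration only).
function whitespaceIndices(text) {
  var re = /\s/g;      // a fresh regex per call, so lastIndex starts at 0
  var indices = [];
  var m;
  while ((m = re.exec(text)) !== null) {
    indices.push(m.index); // m.index is the start position of this match
  }
  return indices;
}

var indices = whitespaceIndices("Saya cinta bahasa java");
console.log(JSON.stringify(indices)); // logs [4,10,17]
```

Creating the regex inside the function avoids carrying a stale lastIndex over from a previous call.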

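Note that in engines that follow the ECMAScript specification, \s is not limited to ASCII whitespace: it also matches the Unicode whitespace and line terminators listed below. Some legacy engines deviated from this, so spelling the characters out explicitly can still be useful.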

Here is a list of Unicode whitespace (see Unicode Character Categories):

Separator, Spaces \p{Zs}:

U+0020 SPACE
U+00A0 NO-BREAK SPACE
U+1680 OGHAM SPACE MARK
U+2000 EN QUAD
U+2001 EM QUAD
U+2002 EN SPACE
U+2003 EM SPACE
U+2004 THREE-PER-EM SPACE
U+2005 FOUR-PER-EM SPACE
U+2006 SIX-PER-EM SPACE
U+2007 FIGURE SPACE
U+2008 PUNCTUATION SPACE
U+2009 THIN SPACE
U+200A HAIR SPACE
U+202F NARROW NO-BREAK SPACE
U+205F MEDIUM MATHEMATICAL SPACE
U+3000 IDEOGRAPHIC SPACE

Separator, Line \p{Zl}:

U+2028 LINE SEPARATOR

Separator, Paragraph \p{Zp}:

U+2029 PARAGRAPH SEPARATOR

So, you can match all of this whitespace explicitly with the following regex:

var re = /[\s\u00A0\u1680\u2000-\u200A\u202F\u205F\u3000\u2028\u2029]/g;
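A short sketch of how this regex could be exercised, assuming a made-up sample string that mixes a no-break space, an ideographic space, a line separator, and an ordinary space (the sample and variable names are illustrative only):

```javascript
var re = /[\s\u00A0\u1680\u2000-\u200A\u202F\u205F\u3000\u2028\u2029]/g;

// Illustrative input: "a", NO-BREAK SPACE, "b", IDEOGRAPHIC SPACE, "c",
// LINE SEPARATOR, "d", SPACE, "e"
var sample = "a\u00A0b\u3000c\u2028d e";

var found = [];
var m;
while ((m = re.exec(sample)) !== null) {
  found.push(m.index); // start position of each whitespace match
}

console.log(JSON.stringify(found)); // logs [1,3,5,7]
```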
Wiktor Stribiżew answered Dec 02 '25 20:12


I don't know what the purpose of this is, but here's a way of doing it.

The replace method does not modify the original string, so you can simply call it with a callback function. The callback is called for every match and receives the matched text, any capturing groups, and the match index as arguments (with no capturing groups, as here, the index is the second argument).

So I just pushed all indexes to an array.

var text1 = "Saya cinta bahasa java"

var indexes = [];

text1.replace(/\s/g, function(m, i) {
  console.log(i);
  indexes.push(i);
});

document.body.innerHTML = indexes;
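The same idea can be sketched as a small function that runs outside the browser. The name indexesViaReplace is hypothetical; returning the matched text from the callback is an added assumption so that the string produced by replace is identical to the input, even though that result is discarded here.

```javascript
// Hypothetical helper (name chosen for illustration only).
function indexesViaReplace(text) {
  var indexes = [];
  // With no capturing groups, the callback receives (match, offset, wholeString).
  text.replace(/\s/g, function (match, offset) {
    indexes.push(offset);
    return match; // keep the replacement identical to the matched text
  });
  return indexes;
}

var text1 = "Saya cinta bahasa java";
var result = indexesViaReplace(text1);
console.log(JSON.stringify(result)); // logs [4,10,17]
console.log(text1);                  // the original string is unchanged
```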
Vitim.us answered Dec 02 '25 20:12


