 

Regular Expression for accurate word-count using JavaScript

I'm trying to put together a regular expression for a JavaScript command that accurately counts the number of words in a textarea.

One solution I had found is as follows:

document.querySelector("#wordcount").innerHTML = document.querySelector("#editor").value.split(/\b\w+\b/).length -1;

But this doesn't count words written in non-Latin scripts (e.g. Cyrillic, Hangul); it skips over them completely.
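The likely reason is that, without the u flag, JavaScript's \w matches only ASCII letters, digits and the underscore, and \b is defined in terms of \w. A minimal sketch of the behaviour, where countBySplit is a hypothetical helper name wrapping the expression above:

```javascript
// Hypothetical helper wrapping the split-based expression from the question.
function countBySplit(text) {
  // Each ASCII word acts as a separator, so N words yield N + 1 pieces.
  return text.split(/\b\w+\b/).length - 1;
}

console.log(countBySplit("hello world")); // 2
console.log(countBySplit("привет мир"));  // 0: no ASCII word characters, so nothing matches
console.log(countBySplit("hello мир"));   // 1: only the ASCII word is seen
```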

Another one I put together:

document.querySelector("#wordcount").innerHTML = document.querySelector("#editor").value.split(/\s+/g).length -1;

But this only counts accurately when the document ends in a space character. A document containing nothing but a single space is counted as 1 word, and a document that begins with a space character has an extraneous word counted.
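These edge cases come from split() returning empty strings for leading and trailing whitespace. A small sketch that reproduces them, with countByWhitespaceSplit as a hypothetical name for the expression above:

```javascript
// Hypothetical helper wrapping the whitespace-split expression from the question.
function countByWhitespaceSplit(text) {
  return text.split(/\s+/g).length - 1;
}

console.log(countByWhitespaceSplit("hello world"));   // 1 (one short)
console.log(countByWhitespaceSplit("hello world "));  // 2 (right only because of the trailing space)
console.log(countByWhitespaceSplit(" hello world ")); // 3 (the leading space adds an empty piece)
console.log(countByWhitespaceSplit(" "));             // 1 (there are no words at all)
```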

Is there a regular expression I can put into this command that counts the words accurately, regardless of input method?

木川 炎星 asked Jan 04 '11 12:01





1 Answer

This should do what you're after:

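(value.match(/\S+/g) || []).length;

Rather than splitting the string, you're matching on any sequence of non-whitespace characters. The || [] fallback is there because match() returns null when nothing matches (for example, an empty textarea), and reading .length from null would throw.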

There's the added bonus of being easily able to extract each word if needed ;)
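As a rough sketch of how this could be wrapped up, assuming hypothetical helper names getWords and countWords (they are not part of any library):

```javascript
// Hypothetical helpers; the names are illustrative only.
function getWords(text) {
  // match() returns null when there are no matches, so fall back to an empty array.
  return text.match(/\S+/g) || [];
}

function countWords(text) {
  return getWords(text).length;
}

console.log(countWords(""));                       // 0
console.log(countWords("  hello   world "));       // 2
console.log(countWords("привет мир 안녕하세요"));   // 3
console.log(getWords(" one two "));                // [ 'one', 'two' ]
```

With the elements from the question, the counter would be updated with something like document.querySelector("#wordcount").innerHTML = countWords(document.querySelector("#editor").value). Note that this treats any run of non-whitespace as a word, so a standalone dash or other punctuation mark surrounded by spaces is counted too.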

David Tang answered Oct 02 '22 18:10
