
Word Count Algorithm in C#

Tags: c#, .net

I am looking for a good word count class or function. When I copy and paste text from the internet and compare the result of my custom word count algorithm with MS Word's count, mine is always off by a little more than 10%. I think that is too much. Do you know of an accurate word count algorithm in C#?

Luke101 asked Oct 27 '09 19:10




2 Answers

As @astander suggests, you can do a String.Split as follows:

string[] a = s.Split(
    new char[] { ' ', ',', ';', '.', '!', '"', '(', ')', '?' },
    StringSplitOptions.RemoveEmptyEntries);

Passing an array of chars lets you split on several word-break characters at once. StringSplitOptions.RemoveEmptyEntries drops the empty strings produced by consecutive separators, so they are not counted as words.
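To show how the split above turns into an actual count, here is a minimal sketch. The `WordCounter` class and `CountWords` method are hypothetical names, and adding tab and newline characters to the separator set is my own assumption, since pasted web text usually contains them.

```csharp
using System;

public static class WordCounter
{
    // Separator set from the snippet above, plus tab, carriage return and
    // newline, which are an added assumption (not part of the original answer).
    private static readonly char[] Separators =
    {
        ' ', '\t', '\r', '\n', ',', ';', '.', '!', '"', '(', ')', '?'
    };

    // Hypothetical helper: counts the non-empty tokens left after splitting.
    public static int CountWords(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return 0;
        }

        return text.Split(Separators, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}

public static class Program
{
    public static void Main()
    {
        // Prints 6: Hello, world, This, is, a, test
        Console.WriteLine(WordCounter.CountWords("Hello, world!  (This is a test.)"));
    }
}
```

Note that splitting on '.' also breaks tokens such as "3.14" or "e.g." into several pieces, so this variant tends to over-count on text with decimals and abbreviations.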

Larsenal answered Oct 02 '22 08:10


Use String.Split with a predefined set of characters: punctuation, spaces (collapsing runs of multiple spaces), and any other characters that you decide are "word splits".
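Which characters you treat as "word splits" changes the count noticeably, and that may account for part of a discrepancy against another tool. This is only a guess, not a description of how MS Word counts. As a point of comparison, here is a sketch that splits on whitespace only; `WhitespaceWordCounter` is a hypothetical name. It relies on the documented behavior that a null separator array makes String.Split use white-space characters as delimiters.

```csharp
using System;

public static class WhitespaceWordCounter
{
    // Hypothetical helper: a "word" here is any run of non-whitespace characters.
    public static int CountWords(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return 0;
        }

        // The cast picks the char[] overload; a null separator array means
        // "split on white-space characters".
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}

public static class Program
{
    public static void Main()
    {
        // Prints 3: "state-of-the-art", "e.g.", "3.14"
        Console.WriteLine(WhitespaceWordCounter.CountWords("state-of-the-art e.g.\t3.14"));
    }
}
```

Splitting the same input on '.' as well would give five tokens instead of three, so it is worth checking your separator set against a few sample strings before comparing totals.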

What have you tried?

I did see that the previous user got nailed for links, but here are some examples of using regex or char matching. Hope it helps, and nobody gets hurt X-)

String.Split Method (Char[])

Word counter in C#

C# Word Count
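The linked examples are not reproduced here, so as a rough illustration of the regex approach, here is a sketch of my own. The pattern, the `RegexWordCounter` name, and the definition of a "word" (a run of letters or digits, optionally joined by internal apostrophes or hyphens) are all assumptions you would need to tune to your text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexWordCounter
{
    // Assumed definition of a word: letters/digits, optionally joined by
    // internal apostrophes or hyphens, e.g. "don't" or "well-known".
    private static readonly Regex WordPattern =
        new Regex(@"[\p{L}\p{N}]+(?:['-][\p{L}\p{N}]+)*", RegexOptions.Compiled);

    // Hypothetical helper: counts the matches of the pattern in the text.
    public static int CountWords(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return 0;
        }

        return WordPattern.Matches(text).Count;
    }
}

public static class Program
{
    public static void Main()
    {
        // Prints 6: Don't, stop, it's, a, well-known, trick
        Console.WriteLine(RegexWordCounter.CountWords("Don't stop: it's a well-known trick."));
    }
}
```

Matching words instead of splitting on separators means you do not have to enumerate every punctuation character up front, at the cost of having to decide what counts as a word.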

Adriaan Stander answered Oct 02 '22 06:10