Counting number of words in C#

Tags:

c#

I'm trying to count the number of words in a rich text box in C#. The code I have below only works if the text is a single line. How do I do this without relying on regex or any other special functions?

string whole_text = richTextBox1.Text;
string trimmed_text = whole_text.Trim();
string[] split_text = trimmed_text.Split(' ');
int space_count = 0;
string new_text = "";

foreach(string av in split_text)
{
    if (av == "")
    {
        space_count++;
    }
    else 
    { 
        new_text = new_text  + av + ",";
    }
}

new_text = new_text.TrimEnd(',');
split_text = new_text.Split(',');
MessageBox.Show(split_text.Length.ToString ());
Wern Ancheta asked Aug 30 '25 18:08


2 Answers

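// Split on spaces and line breaks; RemoveEmptyEntries drops the empty
// strings produced by consecutive delimiters.
char[] delimiters = new char[] { ' ', '\r', '\n' };
int wordCount = whole_text.Split(delimiters, StringSplitOptions.RemoveEmptyEntries).Length;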
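
As a rough sketch of how this behaves on multi-line input, the snippet below wraps the split in a small helper. The `SplitWordCount` class, the `CountWords` name, and the sample string are made up for illustration; they are not from the question.

```csharp
using System;

public static class SplitWordCount
{
    // Hypothetical helper: counts words by splitting on spaces and line breaks.
    public static int CountWords(string text)
    {
        char[] delimiters = { ' ', '\r', '\n' };
        // RemoveEmptyEntries discards the empty strings that appear
        // between consecutive delimiters (e.g. two spaces, or "\r\n").
        return text.Split(delimiters, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    public static void Main()
    {
        // Assumed sample: a double space plus both kinds of line break.
        string sample = "one  two\r\nthree\nfour";
        Console.WriteLine(CountWords(sample)); // prints 4
    }
}
```

Note that tabs are not in the delimiter list, so a tab-separated pair of words would be counted as one word unless '\t' is added.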
Bedasso answered Sep 02 '25 09:09


Since you are only interested in word count, and you don't care about individual words, String.Split could be avoided. String.Split is handy, but it unnecessarily generates a (potentially) large number of String objects, which in turn creates an unnecessary burden on the garbage collector. For each word in your text, a new String object needs to be instantiated, and then soon collected since you are not using it.

For a homework assignment, this may not matter, but if your text box contents change often and you do this calculation inside an event handler, it may be wiser to simply iterate through characters manually. If you really want to use String.Split, then go for a simpler version like Yonix recommended.

Otherwise, use an algorithm similar to this:

string text = richTextBox1.Text;
int wordCount = 0, index = 0;

// skip whitespace until first word
while (index < text.Length && char.IsWhiteSpace(text[index]))
    index++;

while (index < text.Length)
{
    // advance to the end of the current word (a run of non-whitespace chars)
    while (index < text.Length && !char.IsWhiteSpace(text[index]))
        index++;

    wordCount++;

    // skip whitespace until next word
    while (index < text.Length && char.IsWhiteSpace(text[index]))
        index++;
}

This code handles multiple spaces between words, and it also treats tabs and line breaks as separators, because char.IsWhiteSpace returns true for all of them. You can test the code online.
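
For comparison, here is a minimal sketch of an equivalent single-pass variant that uses a flag instead of nested loops. It is a different formulation of the same idea, not the code above; the `WordCounter` class, the `Count` method, and the sample inputs are assumptions for illustration.

```csharp
using System;

public static class WordCounter
{
    // Counts runs of non-whitespace characters without allocating substrings.
    public static int Count(string text)
    {
        int wordCount = 0;
        bool inWord = false; // true while the scan is inside a word

        foreach (char c in text)
        {
            if (char.IsWhiteSpace(c))
            {
                inWord = false;
            }
            else if (!inWord)
            {
                // First character of a new word.
                inWord = true;
                wordCount++;
            }
        }
        return wordCount;
    }

    public static void Main()
    {
        Console.WriteLine(Count("  hello   world\r\n\tfoo ")); // prints 3
        Console.WriteLine(Count(""));                          // prints 0
    }
}
```

Both versions visit each character once and create no intermediate strings, so the choice between them is mostly a matter of readability.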

Groo answered Sep 02 '25 07:09