Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java word count program

Tags:

java

I am trying to make a program on word count which I have partially made and it is giving the correct result but the moment I enter space or more than one space in the string, the result of word count show wrong results because I am counting words on the basis of spaces used. I need help if there is a solution in a way that no matter how many spaces are I still get the correct result. I am mentioning the code below.

public class CountWords 
{
    public static void main (String[] args)
    {

            System.out.println("Simple Java Word Count Program");

            String str1 = "Today is Holdiay Day";

            int wordCount = 1;

            for (int i = 0; i < str1.length(); i++) 
            {
                if (str1.charAt(i) == ' ') 
                {
                    wordCount++;
                } 
            }

            System.out.println("Word count is = " + wordCount);
    }
}
like image 240
Kenneth John Falbous Avatar asked Nov 12 '11 05:11

Kenneth John Falbous


People also ask

How many words are in a Java string program?

print("Input the string: "); String str = in. nextLine(); System. out. print("Number of words in the string: " + count_Words(str)+"\n"); } public static int count_Words(String str) { int count = 0; if (!(


4 Answers

public static void main (String[] args) {

     System.out.println("Simple Java Word Count Program");

     String str1 = "Today is Holdiay Day";

     String[] wordArray = str1.trim().split("\\s+");
     int wordCount = wordArray.length;

     System.out.println("Word count is = " + wordCount);
}

The ideas is to split the string into words on any whitespace character occurring any number of times. The split function of the String class returns an array containing the words as its elements. Printing the length of the array would yield the number of words in the string.

like image 169
A Null Pointer Avatar answered Oct 12 '22 23:10

A Null Pointer


Two routes for this. One way would be to use regular expressions. You can find out more about regular expressions here. A good regular expression for this would be something like "\w+" Then count the number of matches.

If you don't want to go that route, you could have a boolean flag that remembers if the last character you've seen is a space. If it is, don't count it. So the center of the loop looks like this:

boolean prevCharWasSpace=true;
for (int i = 0; i < str1.length(); i++) 
{
    if (str1.charAt(i) == ' ') {
        prevCharWasSpace=true;
    }
else{
        if(prevCharWasSpace) wordChar++;
        prevCharWasSpace = false;

    }
}

Update
Using the split technique is exactly equivalent to what's happening here, but it doesn't really explain why it works. If we go back to our CS theory, we want to construct a Finite State Automa (FSA) that counts words. That FSA may appear as:
enter image description here
If you look at the code, it implements this FSA exactly. The prevCharWasSpace keeps track of which state we're in, and the str1.charAt('i') is decideds which edge (or arrow) is being followed. If you use the split method, a regular expression equivalent of this FSA is constructed internally, and is used to split the string into an array.

like image 23
heneryville Avatar answered Oct 13 '22 00:10

heneryville


Java does have StringTokenizer API and can be used for this purpose as below.

String test = "This is a test app";
int countOfTokens = new StringTokenizer(test).countTokens();
System.out.println(countOfTokens);

OR

in a single line as below

System.out.println(new StringTokenizer("This is a test app").countTokens());

StringTokenizer supports multiple spaces in the input string, counting only the words trimming unnecessary spaces.

System.out.println(new StringTokenizer("This    is    a test    app").countTokens());

Above line also prints 5

like image 24
sunil Avatar answered Oct 13 '22 01:10

sunil


You can use String.split (read more here) instead of charAt, you will get good results. If you want to use charAt for some reason then try trimming the string before you count the words that way you won't have the extra space and an extra word

like image 31
Vamsi Avatar answered Oct 13 '22 00:10

Vamsi