
Calculating the frequency of each word in a sentence in Java

I am writing a very basic Java program that calculates the frequency of each word in a sentence. So far I have managed to do this much:

import java.io.*;

class Linked {

    public static void main(String args[]) throws IOException {

        BufferedReader br = new BufferedReader(
            new InputStreamReader(System.in));
        System.out.println("Enter the sentence");
        String st = br.readLine();
        st = st + " ";
        int a = lengthx(st);
        String arr[] = new String[a];
        int p = 0;
        int c = 0;

        for (int j = 0; j < st.length(); j++) {
            if (st.charAt(j) == ' ') {
                arr[p++] = st.substring(c,j);
                c = j + 1;
            }
        }
    }

    static int lengthx(String a) {
        int p = 0;
        for (int j = 0; j < a.length(); j++) {
            if (a.charAt(j) == ' ') {
                p++;
            }
        }
        return p;
    }
}

I have extracted each word and stored it in an array. The problem now is how to count the number of times each word occurs, and how to display the result so that repeated words are not printed multiple times. Can you help me with this?

Sigma asked Feb 14 '14 05:02



3 Answers

Use a map with the word as the key and the count as the value, something like this:

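    // Needs java.util.Map and java.util.HashMap.
    // words is the array of extracted words (arr in the question).
    Map<String, Integer> map = new HashMap<>();
    for (String w : words) {
        Integer n = map.get(w);
        // First occurrence starts at 1, later ones add 1 to the stored count.
        map.put(w, (n == null) ? 1 : n + 1);
    }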
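For the display part of the question, here is a minimal sketch that wraps the loop above in a method and prints each distinct word once. The class name `WordFrequencyMap` and the method `countWords` are hypothetical names chosen for illustration, and the sample sentence is made up.

```java
import java.util.HashMap;
import java.util.Map;

class WordFrequencyMap {

    // Counts how often each word occurs in the given array.
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> map = new HashMap<>();
        for (String w : words) {
            Integer n = map.get(w);
            map.put(w, (n == null) ? 1 : n + 1);
        }
        return map;
    }

    public static void main(String[] args) {
        String[] words = "the cat and the hat".split(" ");
        // Each distinct word is a key, so it is printed exactly once.
        for (Map.Entry<String, Integer> e : countWords(words).entrySet()) {
            System.out.println(e.getKey() + " : " + e.getValue());
        }
    }
}
```

Because every word is a key in the map, iterating over `entrySet()` visits it only once. A `HashMap` does not guarantee any particular order for the printed lines.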

If you are not allowed to use java.util, you can sort arr with any sorting algorithm so that equal words end up next to each other, and then do this:

    // arr must already be sorted and contain at least one word.
    String[] words = new String[arr.length];
    int[] counts = new int[arr.length];
    words[0] = arr[0];
    counts[0] = 1;
    // j is the index of the last distinct word found so far.
    for (int i = 1, j = 0; i < arr.length; i++) {
        if (words[j].equals(arr[i])) {
            counts[j]++;
        } else {
            j++;
            words[j] = arr[i];
            counts[j] = 1;
        }
    }
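The sort-then-count idea can be sketched end to end without java.util as follows. This is only an illustration under some assumptions: the class `SortedWordCount` and its methods are hypothetical names, insertion sort is picked just because it is short, and a running counter replaces the parallel `words`/`counts` arrays.

```java
class SortedWordCount {

    // Simple insertion sort so that equal words end up next to each other.
    // Note that this sorts the caller's array in place.
    static void sort(String[] arr) {
        for (int i = 1; i < arr.length; i++) {
            String key = arr[i];
            int j = i - 1;
            while (j >= 0 && arr[j].compareTo(key) > 0) {
                arr[j + 1] = arr[j];
                j--;
            }
            arr[j + 1] = key;
        }
    }

    // Returns one line per distinct word, for example "the : 2".
    static String report(String[] arr) {
        if (arr.length == 0) {
            return "";
        }
        sort(arr);
        StringBuilder sb = new StringBuilder();
        String current = arr[0];
        int count = 1;
        for (int i = 1; i < arr.length; i++) {
            if (arr[i].equals(current)) {
                count++;
            } else {
                // A new word starts, so the previous run is complete.
                sb.append(current).append(" : ").append(count).append('\n');
                current = arr[i];
                count = 1;
            }
        }
        // Emit the final run.
        sb.append(current).append(" : ").append(count).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] arr = {"the", "cat", "and", "the", "hat"};
        System.out.print(report(arr));
    }
}
```

For the sample array this prints `and : 1`, `cat : 1`, `hat : 1` and `the : 2`, one per line.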

Since Java 8 there is also an interesting solution with ConcurrentHashMap:

    ConcurrentMap<String, Integer> m = new ConcurrentHashMap<>();
    m.compute("x", (k, v) -> v == null ? 1 : v + 1);
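Applied to a whole array of words, the `compute` call might look like the sketch below. The class name `ComputeWordCount`, the method `countWords` and the sample sentence are assumptions made for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ComputeWordCount {

    static ConcurrentMap<String, Integer> countWords(String[] words) {
        ConcurrentMap<String, Integer> m = new ConcurrentHashMap<>();
        for (String w : words) {
            // v is null the first time a word is seen.
            m.compute(w, (k, v) -> v == null ? 1 : v + 1);
        }
        return m;
    }

    public static void main(String[] args) {
        String[] words = "to be or not to be".split(" ");
        // Prints every distinct word once, in no guaranteed order.
        countWords(words).forEach((word, n) -> System.out.println(word + " : " + n));
    }
}
```

`m.merge(w, 1, Integer::sum)` should give the same counts and is arguably a little shorter.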
Evgeniy Dorofeev answered Nov 19 '22 04:11



In Java 8, you can write this in two statements using streams. You can also process the words in parallel, although that is only likely to pay off for large inputs.

Here is a concise way to do this:

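// text holds the input sentence.
// Needs java.util.Map, java.util.stream.Collectors and java.util.stream.Stream.
Stream<String> stream = Stream.of(text.toLowerCase().split("\\W+")).parallel();

Map<String, Long> wordFreq = stream
     .collect(Collectors.groupingBy(String::toString, Collectors.counting()));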
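As a self-contained sketch of the same idea: the class `StreamWordCount` and the method `wordFreq` are hypothetical names, `Function.identity()` stands in for `String::toString`, and the stream is kept sequential here. The extra `filter` is an assumption added to drop the empty first token that `split` produces when the text starts with punctuation.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class StreamWordCount {

    static Map<String, Long> wordFreq(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                // Skip the empty token that appears if text starts with a non-word character.
                .filter(w -> !w.isEmpty())
                // Group equal words together and count the size of each group.
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(wordFreq("The cat saw the other cat."));
    }
}
```

Lower-casing first means "The" and "the" are counted as the same word. Drop `toLowerCase()` if the count should be case-sensitive.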
Bahul Jain answered Nov 19 '22 03:11



import java.util.*;

public class WordCounter {

    public static void main(String[] args) {

        String s = "this is a this is this a this yes this is a this what it may be i do not care about this";
        String a[] = s.split(" ");
        Map<String, Integer> words = new HashMap<>();
        for (String str : a) {
            if (words.containsKey(str)) {
                words.put(str, 1 + words.get(str));
            } else {
                words.put(str, 1);
            }
        }
        System.out.println(words);
    }
}

Output: {a=3, be=1, may=1, yes=1, this=7, about=1, i=1, is=3, it=1, do=1, not=1, what=1, care=1}
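A `HashMap` does not guarantee any iteration order, so the order of the printed entries can vary. If alphabetical output is wanted, one possible variation (a sketch, with the hypothetical class name `OrderedWordCounter` and a shortened sample sentence) is to swap in a `TreeMap` and use `Map.merge` for the counting:

```java
import java.util.Map;
import java.util.TreeMap;

class OrderedWordCounter {

    static Map<String, Integer> count(String s) {
        // TreeMap keeps its keys in natural (alphabetical) order.
        Map<String, Integer> words = new TreeMap<>();
        for (String str : s.split(" ")) {
            // Inserts 1 for a new word, otherwise adds 1 to the existing count.
            words.merge(str, 1, Integer::sum);
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(count("this is a this is this a this yes"));
    }
}
```

Output: {a=2, is=2, this=4, yes=1}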

AKT answered Nov 19 '22 03:11
