Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding all numbers in a array with top frequency?

Tags:

java

arrays

I am trying to find out all the numbers with maximum frequency. i.e If maximum frequency is 5 then, I need all numbers that occurred 5 times in a array.

let us consider following example of array:

1 8 7 8 9 2 1 9 6 4 3 5

Here, Most frequent numbers are 8, 1, 9 with highest frequency of 2. My expected output is something like this:

8 => 2
1 => 2
9 => 2

In my project, I am trying to find out most frequent numbers and least frequent numbers. Here I want just most frequent numbers.

I have generated 1000 random numbers similar to my project scenario and have calculated distinct numbers and then their occurrence.

    int n=100;
    int N=1000;

    int data[] = new int[N];
    Set<Integer> set = new HashSet<Integer>();

    Random random = new Random();

    for(int i=0;i<N;i++){
        int  number = random.nextInt(n);
        data[i] = number;
        set.add(number);
    }

    int frequency[] = new int[set.size()];
    Integer[] distinct = set.toArray(new Integer[set.size()]);

    for (int j=0;j<set.size();j++){
        int count=0;
        for(int k=0;k<N;k++){
            if(distinct[j]==data[k]){
                count = count+1;
            }
        }
        frequency[j] = count;
    }

After calculating frequencies of each number, I have calculated numbers with most frequencies using answer from here which is optimized one.

    int max = Integer.MIN_VALUE;
    List<Integer> vals = new ArrayList<>();

    for (int q=0; q < frequency.length; ++q) {

        if (frequency[q] == max) {
            vals.add(q);
        }

        else if (frequency[q] > max) {
            vals.clear();
            vals.add(q);
            max = frequency[q];
        }
    }

    for(int num : vals){
        System.out.println(distinct[num]+" => "+frequency[num]);
    }

Here, Loop in first code making whole process slower. This is only part of large code and sample test case.

I want to make the process faster since there may be large elements in the array in real case.

Anyone have way to optimize those loops ? or some other way to get the result ?

Any kind of help is appreciated.

like image 503
Sagar Gautam Avatar asked Dec 13 '22 22:12

Sagar Gautam


2 Answers

I would use streams for this. It doens’t turn out to be very much shorter, but once you get comfortable with streams, it will be conceptually simpler.

    Map<Integer, Long> frequencies = Arrays.stream(data)
            .boxed()
            .collect(Collectors.groupingBy(i -> i, Collectors.counting()));
    if (frequencies.isEmpty()) {
        System.out.println("No data");
    } else {
        long topFrequency = frequencies.values()
                .stream()
                .max(Long::compareTo)
                .get();
        int[] topNumbers = frequencies.entrySet()
                .stream()
                .filter(e -> e.getValue() == topFrequency)
                .mapToInt(Map.Entry::getKey)
                .toArray();
        for (int number : topNumbers) {
            System.out.println("" + number + " => " + topFrequency);
        }
    }

With the example data from the question it prints the desired (only in another unpredictable order):

1 => 2
8 => 2
9 => 2

Edit: tucuxi asked: why not use the stream to print with too? You may do that of course, for shorter and yet simpler code:

        frequencies.entrySet()
                .stream()
                .filter(e -> e.getValue() == topFrequency)
                .mapToInt(Map.Entry::getKey)
                .forEach(n -> System.out.println("" + n + " => " + topFrequency));

What to choose depends on both requirements and taste. I was expecting that the OP would need to store the top frequency numbers, so I demonstrated how to do that, and just printed them to show the result. Also some hold the ideal that streams should be free from side effects, where I would consider printing to standard output a side effect. But use it if you like.

like image 158
Ole V.V. Avatar answered Dec 26 '22 16:12

Ole V.V.


This code is very inefficient and may run in O(n^2) at the worst case.

You can achieve your goal with a single for loop by building a Map<Integer,Integer> where the key is each unique number you encounter and the value is its frequency.

After you have the Map, it's trivial to find all the numbers having max frequency (just iterate over all the entries of the Map). The total running time will be O(n).

int maxFreq = Integer.MIN_VALUE;
Map<Integer,Integer> freqs = new HashMap<>();
for(int i=0;i<N;i++){
    int number = random.nextInt(n);
    data[i] = number;
    Integer freq = freqs.get(number);
    if (freq != null) {
        freq = freq + 1;
    } else {
        freq = 1;
    }
    freqs.put(number,freq);
    if (freq > maxFreq)
        maxFreq = freq;
}
for(Map.Entry<Integer,Integer> entry : freqs.entrySet()) {
    if (entry.getValue().equals(maxFreq)) {
        System.out.println(entry.getKey() +" => "+ maxFreq);
    }
}
like image 33
Eran Avatar answered Dec 26 '22 16:12

Eran