Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to count groups of integers within arrays, without sorting the array?

My goal is to be able to count groups of the same integer within an array. For example, in an array like so, {1, 1, 1, 2, 2, 3, 1, 1}, there are 4 groups:

  • of at least size 1: 3 groups
  • of at least size 2: 1 group
  • of at least size 3

I'm having issues accomplishing this without sorting the array. When it gets sorted I lose count of the group of two 1's at the end of the array, since it gets put next to the other groups of 1's.

int result = (int) Stream.of(1, 1, 1, 2, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 5, 4, 4, 4, 6)
        .collect(Collectors.groupingBy(i -> i))
        .entrySet().stream()
        .filter(entry -> entry.getValue().size() >= 1) // specify the size
        .count()

return result;

This expected output for each size is as follows:

size 1 count == 8

size 2 count == 5

size 6 count == 1

size 8 count == 1

The actual output is as follows:

size 1 count == 6

size 2 count == 3

size 6 count == 2

size 8 count == 1

The difference is a result of the array being sorted prior to the count taking place. Is there any way to accomplish this?

Edit: A group is essentially any place where the same integer repeats, until an integer of a different value, is ahead of it; so, the groups of size 2 in this in this code are any at index 0 to 2(inclusive), index 4 - 5(inclusive), index 6 - 15(inclusive, index 16 - 18(inclusive, and index 20 -22(inclusive). Since there are 5 groups that are of at least size 2 a count of 5 should be returned.

An imperative style of code for my goal.

Scanner key = new Scanner("1 1 1 2 1 1 3 3 3 3 3 3 3 3 3 3 4 4 4 5 4 4 4 6");

        int cnt = 0;
        int counter = 0;
        int i = 0;

            while(key.hasNextInt()) {   
                int next = key.nextInt();

                if(next == array[i]) {
                    counter++;
                }

                if(i + 1 < array.length && i -1 >= 0 
                   && counter >=size  
                   && next != array[i + 1] 
                   && next == array[i-size + 1]) {
                    cnt++;
                    counter = 0;
                }
                i++;
            }
        return cnt;

The expected return for this is the same as above.

The actual return is:

size 1 count == 7

size 2 count == 5

size 6 count == 3

size 8 count == 1

The problem with this loop is I believe it is skipping first piece and the end piece of the array.

I'm not having the same sorting issue as the Stream way.

Ideally, this wouldn't require any external utilities/libraries.

like image 523
Harsh Patel Avatar asked Mar 30 '19 17:03

Harsh Patel


3 Answers

First I would suggest to find all subgroups. For this you can use Stream.collect() with a custom collector:

List<List<Integer>> sublists = IntStream.of(1, 1, 1, 2, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 5, 4, 4, 4, 6)
        .collect(ArrayList::new, (lists, value) -> {
            if (lists.isEmpty() || lists.get(lists.size() - 1).stream().noneMatch(e -> e == value)) {
                lists.add(new ArrayList<>());
            }
            lists.get(lists.size() - 1).add(value);
        }, (l1, l2) -> {
            throw new RuntimeException("not supported for parallel streams");
        });

The result is:

[[1, 1, 1], [2], [1, 1], [3, 3, 3, 3, 3, 3, 3, 3, 3, 3], [4, 4, 4], [5], [4, 4, 4], [6]]

Now you can use this to group the list sizes:

Map<Integer, Long> result = sublists.stream()
        .collect(Collectors.groupingBy(List::size, Collectors.counting()));
result.forEach((size, count) -> System.out.println(String.format("size %s count %s", size, count)));

This finds all existing group sizes and prints:

size 1 count 3
size 2 count 1
size 3 count 3
size 10 count 1

To count all groups with a min length you can use:

Map<Integer, Long> result = IntStream.rangeClosed(1, sublists.stream().mapToInt(List::size).max().orElse(0)).boxed()
        .collect(Collectors.toMap(Function.identity(), i -> sublists.stream().filter(l -> l.size() >= i).count()));
result.forEach((size, count) -> System.out.println(String.format("size %s count %s", size, count)));

This prints:

size 1 count 8
size 2 count 5
size 3 count 4
size 4 count 1
size 5 count 1
size 6 count 1
size 7 count 1
size 8 count 1
size 9 count 1
size 10 count 1

To get only a predefined set of sizes (e. g. 1, 2, 6, 8) you can modify the last solution:

Map<Integer, Long> result = IntStream.of(1, 2, 6, 8).boxed()
        .collect(Collectors.toMap(Function.identity(), i -> sublists.stream().filter(l -> l.size() >= i).count()));
result.forEach((size, count) -> System.out.println(String.format("size %s count %s", size, count)));

The result of this is:

size 1 count 8
size 2 count 5
size 6 count 1
size 8 count 1
like image 132
Samuel Philipp Avatar answered Oct 27 '22 12:10

Samuel Philipp


If using StreamEx is an option it get's quite easy

IntStream s = IntStream.of(1, 1, 1, 2, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 5, 4, 4, 4, 6);
List<List<Integer>> t = IntStreamEx.of (s).boxed().groupRuns(Integer::equals).toList();
t.forEach(e -> {
  System.out.println(String.format("Group of %s - %s times", e.get(0), e.size()));
});
like image 40
baao Avatar answered Oct 27 '22 14:10

baao


First of all, let me start by saying this is not really what the Stream API is made for but however, it is possible, in maybe not the most elegant way, but it is nonetheless possible.

Here is a possible solution for you to use

  1. Convert everything to a big string and split it where number get different.
  2. Now stream through the stream and collect the groups and their count
  3. Add the number of higher groups to each of the smaller ones (to get the at least of logic)
  4. There, you should have all the info you needed

Demo

String[] groups = Stream.of(1, 1, 1, 2, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 5, 4, 4, 4, 6)
        .map(String::valueOf)
        .collect(Collectors.joining(","))
        .split("(?<=(\\d))(,)(?!\\1)");

Map<Integer, Long> groupedByGroupSizes = Arrays.stream(groups)
        .map(group -> group.split(",").length)
        .collect(Collectors.groupingBy(x -> x, Collectors.counting()));

TreeMap<Integer, Long> integerLongTreeMap = new TreeMap<>(groupedByGroupSizes);
int size = integerLongTreeMap.size();

for (Integer integer : integerLongTreeMap.keySet()) {
    Long value = integerLongTreeMap.get(integer);
    integerLongTreeMap.put(integer, value + --size);
}

integerLongTreeMap.entrySet().forEach(entry -> System.out.println(String.format("of at least size %s: %s groups", entry.getKey(), entry.getValue())));

Prints

of at least size 1: 6 groups
of at least size 2: 3 groups
of at least size 3: 4 groups
of at least size 10: 1 groups
like image 25
Yassin Hajaj Avatar answered Oct 27 '22 12:10

Yassin Hajaj