
Java 8 - Remove repeated sequence of elements from a List

I need to use the Java Stream API to process a stream of events from a system and apply a cleanup step that removes repeated events. The goal is to collapse the same event repeated multiple times in a row into a single occurrence, not to produce a list of distinct events. Most of the Java Stream API examples available online only show how to produce distinct output from a given input.

For example, given the input stream

[a, b, c, a, a, a, a, d, d, d, c, c, e, e, e, e, e, e, f, f, f]

the output List or Stream should be

[a, b, c, a, d, c, e, f]

My current implementation (not using the Stream API) looks like this:

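import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedList;
import java.util.List;

public class Test {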
    public static void main(String[] args) {
        String fileName = "src/main/resources/test.log";
        try {
            List<String> list = Files.readAllLines(Paths.get(fileName));
            LinkedList<String> acc = new LinkedList<>();

            for (String line: list) {
                if (acc.isEmpty())
                    acc.add(line);
                else if (! line.equals(acc.getLast()) )
                    acc.add(line);
            }

            System.out.println(list);
            System.out.println(acc);

        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}

Output:

[a, b, c, a, a, a, a, d, d, d, c, c, e, e, e, e, e, e, f, f, f]
[a, b, c, a, d, c, e, f]

I've tried various examples with reduce, groupingBy, etc., without success. I can't find a way to compare each stream element with the last element in my accumulator, if that is even possible.

Amitoj asked Jan 16 '17 11:01



3 Answers

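Another concise option is a stateful filter that remembers the previous element. Because the lambda is stateful, this is only reliable on sequential streams: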

AtomicReference<Character> previous = new AtomicReference<>(null);
Stream.of('a', 'b', 'b', 'a').filter(cur -> !cur.equals(previous.getAndSet(cur)));
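
As a rough illustration, the snippet above can be wrapped into a runnable program and applied to the input from the question. The class name StatefulFilterDedup and the method name dropConsecutiveRepeats are made up for this sketch, and it assumes a sequential, ordered stream with no null elements.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Collectors;

// Hypothetical class and method names, used only for illustration.
class StatefulFilterDedup {

    // Drops every element that equals its immediate predecessor.
    // Assumes a sequential, ordered stream of non-null elements.
    static <T> List<T> dropConsecutiveRepeats(List<T> input) {
        AtomicReference<T> previous = new AtomicReference<>(null);
        return input.stream()
                // getAndSet returns the previously seen element and stores the current one
                .filter(cur -> !cur.equals(previous.getAndSet(cur)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
                "a", "b", "c", "a", "a", "a", "a", "d", "d", "d",
                "c", "c", "e", "e", "e", "e", "e", "e", "f", "f", "f");
        // prints [a, b, c, a, d, c, e, f]
        System.out.println(dropConsecutiveRepeats(events));
    }
}
```

The AtomicReference is used here only as a mutable holder that a lambda is allowed to capture; it does not make the filter safe for parallel streams, where elements may be tested out of order.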
Abhinav Atul answered Oct 15 '22 01:10


You can use a custom Collector to achieve your goal:

Stream<String> lines =  Files.lines(Paths.get("distinct.txt"));
LinkedList<String> values = lines.collect(Collector.of(
            LinkedList::new,
            (list, string) -> {
                if (list.isEmpty())
                    list.add(string);
                else if (!string.equals(list.getLast()))
                    list.add(string);
            },
            (left, right) -> {
                left.addAll(right);
                return left;
            }
    ));

values.forEach(System.out::println);

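However, this can give wrong results with a parallel stream: the combiner simply concatenates the partial lists, so a repeated element that straddles the boundary between two chunks is kept twice.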
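
One possible way to handle the parallel case is to make the combiner check the boundary between the two partial lists and drop the first element of the right-hand list when it equals the last element of the left-hand list. The sketch below is an adaptation of the collector above, not the original answer's code; the class name ConsecutiveDedupCollector and the method name dedupConsecutive are hypothetical, and it assumes an ordered stream with no null elements.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collector;

// Hypothetical class and method names, used only for illustration.
class ConsecutiveDedupCollector {

    // Collector that collapses runs of equal elements into a single element.
    // Assumes an ordered stream of non-null elements.
    static <T> Collector<T, LinkedList<T>, LinkedList<T>> dedupConsecutive() {
        return Collector.of(
                LinkedList::new,
                (list, item) -> {
                    // Append only if the item differs from the last one kept
                    if (list.isEmpty() || !item.equals(list.getLast())) {
                        list.add(item);
                    }
                },
                (left, right) -> {
                    // A run may straddle the chunk boundary: drop the duplicate
                    if (!left.isEmpty() && !right.isEmpty()
                            && left.getLast().equals(right.getFirst())) {
                        right.removeFirst();
                    }
                    left.addAll(right);
                    return left;
                });
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
                "a", "b", "c", "a", "a", "a", "a", "d", "d", "d",
                "c", "c", "e", "e", "e", "e", "e", "e", "f", "f", "f");
        // prints [a, b, c, a, d, c, e, f]
        System.out.println(events.parallelStream().collect(dedupConsecutive()));
    }
}
```

Each partial list is already free of consecutive repeats, so checking a single element at the boundary is enough: once the first element of the right-hand list is removed, the next one cannot equal it.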

Anton Balaniuc answered Oct 15 '22 01:10


You can use IntStream to iterate over the index positions of the input List (called list here) and compare each element with its neighbour:

List<String> acc = IntStream
            .range(0, list.size())
            .filter(i -> ((i < list.size() - 1 && !list.get(i).equals(list
                    .get(i + 1))) || i == list.size() - 1))
            .mapToObj(i -> list.get(i)).collect(Collectors.toList());
System.out.println(acc);

Explanation

  1. IntStream.range(0, list.size()) : Returns a sequence of primitive int values, used as index positions into the list.
  2. filter(i -> ((i < list.size() - 1 && !list.get(i).equals(list.get(i + 1))) || i == list.size() - 1)) : Keeps an index only if its element differs from the element at the next index, or if it is the last index. This retains the last element of each run rather than the first, which yields the same sequence of values.
  3. mapToObj(i -> list.get(i)) : Converts the IntStream of indices to a Stream<String> of elements.
  4. collect(Collectors.toList()) : Collects the results in a List.
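
The steps above can also be written so that the first element of each run is kept, by comparing with the previous index instead of the next one. This is a minimal variant for illustration, not the answer's original code; the class name IndexBasedDedup and the method name keepFirstOfEachRun are hypothetical, and it assumes a random-access list (such as ArrayList) with no null elements, since get(i) is slow on a LinkedList.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical class and method names, used only for illustration.
class IndexBasedDedup {

    // Keeps index i when it is the first index or its element differs
    // from the element just before it.
    static <T> List<T> keepFirstOfEachRun(List<T> list) {
        return IntStream.range(0, list.size())
                .filter(i -> i == 0 || !list.get(i).equals(list.get(i - 1)))
                .mapToObj(list::get)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
                "a", "b", "c", "a", "a", "a", "a", "d", "d", "d",
                "c", "c", "e", "e", "e", "e", "e", "e", "f", "f", "f");
        // prints [a, b, c, a, d, c, e, f]
        System.out.println(keepFirstOfEachRun(events));
    }
}
```

Because the predicate only reads from the list and keeps no mutable state, this form should also behave correctly on a parallel stream, unlike a stateful filter.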
Chetan Kinger answered Oct 15 '22 01:10