I have a requirement to use the Java Stream API to process a stream of events from a system and apply a data cleanup step that removes repeated events. This means removing the same event when it is repeated multiple times in sequence, not creating a list of distinct events. Most of the Java Stream API examples available online target creating a distinct output from a given input.
For example, for the input stream
[a, b, c, a, a, a, a, d, d, d, c, c, e, e, e, e, e, e, f, f, f]
the output List or Stream should be
[a, b, c, a, d, c, e, f]
My current implementation (not using the Stream API) looks like this:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedList;
import java.util.List;

public class Test {
    public static void main(String[] args) {
        String fileName = "src/main/resources/test.log";
        try {
            List<String> list = Files.readAllLines(Paths.get(fileName));
            LinkedList<String> acc = new LinkedList<>();
            for (String line : list) {
                // Keep a line only if it differs from the last line kept.
                if (acc.isEmpty())
                    acc.add(line);
                else if (!line.equals(acc.getLast()))
                    acc.add(line);
            }
            System.out.println(list);
            System.out.println(acc);
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}
Output:
[a, b, c, a, a, a, a, d, d, d, c, c, e, e, e, e, e, e, f, f, f]
[a, b, c, a, d, c, e, f]
I've tried various examples with reduce, groupingBy, etc., without success. I can't seem to find a way to compare each stream element with the last element in my accumulator, if there is such a possibility.
Duplicate objects can be removed from a list (for example, a list of users) by calling the distinct method on its stream. The distinct method internally calls the equals method of the object to check whether two objects are the same, and returns a stream of distinct objects.
Set implementations in Java hold only unique elements, so they can also be used to remove duplicate elements:
HashSet<Integer> set = new HashSet<Integer>(list1);
List<Integer> list2 = new ArrayList<Integer>(set);
After this, list2 contains only unique elements. Note that HashSet does not guarantee iteration order, and that both approaches remove every duplicate, not only consecutive ones.
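To make the difference from the question's goal concrete, here is a minimal sketch that applies distinct to the question's sample input. The class and method names (`DistinctDemo`, `distinctElements`) are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class DistinctDemo {
    // Hypothetical helper: removes ALL duplicates, keeping first occurrences.
    static <T> List<T> distinctElements(List<T> input) {
        return input.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
                "a", "b", "c", "a", "a", "a", "a", "d", "d", "d",
                "c", "c", "e", "e", "e", "e", "e", "e", "f", "f", "f");
        // prints [a, b, c, d, e, f]
        System.out.println(distinctElements(events));
    }
}
```

The second a and the second c are lost here, whereas the desired output [a, b, c, a, d, c, e, f] keeps them because they start a new run.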
Another concise syntax would be
AtomicReference<Character> previous = new AtomicReference<>(null);
Stream.of('a', 'b', 'b', 'a').filter(cur -> !cur.equals(previous.getAndSet(cur)));
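The snippet above has no terminal operation, so nothing is evaluated yet. A minimal runnable sketch of the same idea follows; the class and method names are hypothetical. It assumes a sequential stream with no null elements, because the filter is stateful and depends on elements arriving in encounter order.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class StatefulFilterDedup {
    // Hypothetical helper: drops each element that equals the one right before it.
    static <T> List<T> dropConsecutiveRepeats(Stream<T> input) {
        // Holds the previously seen element; null before the first element.
        AtomicReference<T> previous = new AtomicReference<>();
        return input
                .sequential() // the stateful filter is only meaningful in encounter order
                .filter(cur -> !cur.equals(previous.getAndSet(cur)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [a, b, a]
        System.out.println(dropConsecutiveRepeats(Stream.of('a', 'b', 'b', 'a')));
    }
}
```

The Stream API documentation advises against stateful lambdas in general, so this is best treated as a shortcut for small, strictly sequential pipelines.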
You might use a custom Collector to achieve your goal. Please find details below:
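// Files.lines keeps the file open, so close the stream with try-with-resources.
try (Stream<String> lines = Files.lines(Paths.get("distinct.txt"))) {
    LinkedList<String> values = lines.collect(Collector.of(
        LinkedList::new,
        // Accumulator: add a line only if it differs from the last one kept.
        (list, string) -> {
            if (list.isEmpty())
                list.add(string);
            else if (!string.equals(list.getLast()))
                list.add(string);
        },
        // Combiner: only used for parallel streams; plain concatenation.
        (left, right) -> {
            left.addAll(right);
            return left;
        }
    ));
    values.forEach(System.out::println);
}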
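However, it can produce wrong results when a parallel stream is used: the combiner simply concatenates the partial lists, so a run of repeats that is split across two chunks keeps one element from each chunk.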
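One possible way around the parallel issue is a combiner that checks the boundary between the two partial results. The sketch below is an assumption about how that could look, not part of the original answer; the names `ConsecutiveDedupCollector` and `withoutConsecutiveRepeats` are hypothetical, and it assumes an ordered stream with no null elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

final class ConsecutiveDedupCollector {
    // Hypothetical collector: keeps one element per run of equal elements.
    static <T> Collector<T, ?, ArrayList<T>> withoutConsecutiveRepeats() {
        return Collector.<T, ArrayList<T>>of(
                ArrayList::new,
                // Accumulator: add only if different from the last element kept.
                (acc, item) -> {
                    if (acc.isEmpty() || !item.equals(acc.get(acc.size() - 1))) {
                        acc.add(item);
                    }
                },
                // Combiner: if a run straddles the boundary, skip the first
                // element of the right part because the left part already has it.
                (left, right) -> {
                    if (left.isEmpty()) return right;
                    if (right.isEmpty()) return left;
                    boolean sameRun = left.get(left.size() - 1).equals(right.get(0));
                    left.addAll(right.subList(sameRun ? 1 : 0, right.size()));
                    return left;
                });
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
                "a", "b", "c", "a", "a", "a", "a", "d", "d", "d",
                "c", "c", "e", "e", "e", "e", "e", "e", "f", "f", "f");
        // prints [a, b, c, a, d, c, e, f]
        System.out.println(events.parallelStream().collect(withoutConsecutiveRepeats()));
    }
}
```

Because each partial list is already free of consecutive repeats, checking only the single boundary element is enough. An ArrayList is used instead of a LinkedList here purely so that `subList` and index access are cheap.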
You can use IntStream to get hold of the index positions in the List and use this to your advantage as follows:
List<String> acc = IntStream
        .range(0, list.size())
        .filter(i -> ((i < list.size() - 1 && !list.get(i).equals(list.get(i + 1)))
                || i == list.size() - 1))
        .mapToObj(i -> list.get(i))
        .collect(Collectors.toList());
System.out.println(acc);
Explanation
IntStream.range(0, list.size())
: Returns a sequence of primitive int values, which are used as the index positions to access the list.
.filter(i -> ((i < list.size() - 1 && !list.get(i).equals(list.get(i + 1))) || i == list.size() - 1))
: Proceeds only if the element at the current index position is not equal to the element at the next index position, or if the last index position has been reached.
.mapToObj(i -> list.get(i))
: Converts the stream to a Stream<String>.
.collect(Collectors.toList())
: Collects the results in a List.
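The index-based approach above keeps the last element of each run. A variant that compares with the previous index instead keeps the first element of each run, which mirrors the loop in the question. This is a minimal sketch with hypothetical names; it assumes a random-access list (such as ArrayList) with no null elements.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class IndexBasedDedup {
    // Hypothetical helper: keeps index i only if it starts a new run.
    static <T> List<T> dropConsecutiveRepeats(List<T> list) {
        return IntStream.range(0, list.size())
                // Index 0 always starts a run; otherwise compare with the previous element.
                .filter(i -> i == 0 || !list.get(i).equals(list.get(i - 1)))
                .mapToObj(list::get)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
                "a", "b", "c", "a", "a", "a", "a", "d", "d", "d",
                "c", "c", "e", "e", "e", "e", "e", "e", "f", "f", "f");
        // prints [a, b, c, a, d, c, e, f]
        System.out.println(dropConsecutiveRepeats(events));
    }
}
```

Since the filter only reads the list and holds no state of its own, this form does not depend on the order in which elements are processed. On a LinkedList each `get(i)` is linear, so the approach is better suited to array-backed lists.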