In Java 8 how can I filter a collection using the Stream
API by checking the distinctness of a property of each object?
For example, I have a list of Person
objects and I want to remove people with the same name:
persons.stream().distinct();
will use the default equality check for a Person
object, so I need something like:
persons.stream().distinct(p -> p.getName());
Unfortunately, the distinct()
method has no such overload. Without modifying the equality check inside the Person
class, is it possible to do this succinctly?
distinct() is a method of the Stream interface that returns a stream consisting of the distinct elements of the original stream. It uses the elements' hashCode() and equals() methods to detect duplicates. For ordered streams, the selection of distinct elements is stable: among duplicates, the element that appears first in encounter order is kept.
We'll use the distinct() method from the Stream API, which returns a stream consisting of distinct elements based on the result returned by the equals() method.
Remove duplicates in an ArrayList (Java 8). To remove duplicates from an ArrayList, we can also use the Java 8 Stream API. Use the stream's distinct() method, which returns a stream consisting of the distinct elements as compared by each object's equals() method, then collect the distinct elements into a List using Collectors.toList().
You can use the Stream.distinct() method to remove duplicates from a Stream in Java 8 and beyond. The distinct() method behaves like the DISTINCT clause in SQL, which eliminates duplicate rows from the result set.
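As a rough illustration of the behaviour described above, here is a minimal sketch of plain distinct() on a list of strings. The class name DistinctDemo and the helper dedupe are made up for this example; String already defines equals() and hashCode(), which is why no extra work is needed here, unlike with a Person class that does not override them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original question or answer.
class DistinctDemo {

    // Removes duplicates using each element's own equals()/hashCode().
    static <T> List<T> dedupe(List<T> items) {
        return items.stream()
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ann", "Bob", "Ann", "Cid", "Bob");
        // Sequential, ordered stream: the first occurrence of each value is kept.
        System.out.println(dedupe(names)); // prints [Ann, Bob, Cid]
    }
}
```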
Consider distinct
to be a stateful filter. Here is a function that returns a predicate that maintains state about what it's seen previously, and that returns whether the given element was seen for the first time:
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> seen.add(keyExtractor.apply(t));
}
Then you can write:
persons.stream().filter(distinctByKey(Person::getName))
Note that if the stream is ordered and is run in parallel, this will preserve an arbitrary element from among the duplicates, instead of the first one, as distinct()
does.
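For anyone who wants to try this end to end, the following is a self-contained sketch of the distinctByKey approach. The Person class, its age field, the helper firstPerName and the class name DistinctByKeyDemo are assumptions added for illustration; only distinctByKey itself comes from the answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical demo class wrapping the distinctByKey predicate from the answer.
class DistinctByKeyDemo {

    // Assumed minimal Person; it deliberately does not override equals()/hashCode().
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Stateful predicate: Set.add returns true only the first time a key is seen.
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> seen.add(keyExtractor.apply(t));
    }

    // Keeps one Person per name; on a sequential stream that is the first one.
    static List<Person> firstPerName(List<Person> persons) {
        return persons.stream()
                .filter(distinctByKey(Person::getName))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Ann", 30), new Person("Bob", 25), new Person("Ann", 41));
        for (Person p : firstPerName(persons)) {
            System.out.println(p.getName() + " " + p.getAge());
        }
        // prints "Ann 30" then "Bob 25"
    }
}
```

Two things to keep in mind: call distinctByKey once per stream pipeline, since each call creates a fresh "seen" set, and note that a set from ConcurrentHashMap.newKeySet() rejects null, so a key extractor that can return null would throw a NullPointerException here.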
(This is essentially the same as my answer to this question: Java Lambda Stream Distinct() on arbitrary key?)
An alternative would be to place the persons in a map using the name as a key:
persons.stream().collect(Collectors.toMap(Person::getName, p -> p, (p, q) -> p)).values();
Note that the Person that is kept, in case of a duplicate name, will be the first encountered.
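A possible runnable version of the map-based approach is sketched below. It is a variation, not exactly the one-liner above: it uses the four-argument Collectors.toMap overload with LinkedHashMap::new so that the result also keeps the original encounter order, which the default map does not promise. The Person class, the helper firstPerName and the class name ToMapDistinctDemo are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class for de-duplicating by name via a map.
class ToMapDistinctDemo {

    // Assumed minimal Person with just the fields this example needs.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Keys by name; on a duplicate key the merge function keeps the existing value (p).
    static List<Person> firstPerName(List<Person> persons) {
        Map<String, Person> byName = persons.stream()
                .collect(Collectors.toMap(
                        Person::getName,
                        Function.identity(),
                        (p, q) -> p,
                        LinkedHashMap::new)); // insertion-ordered, unlike the default map
        return new ArrayList<>(byName.values());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Bob", 25), new Person("Ann", 30), new Person("Bob", 41));
        firstPerName(persons)
                .forEach(p -> System.out.println(p.getName() + " " + p.getAge()));
        // prints "Bob 25" then "Ann 30"
    }
}
```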