
Java 8 Distinct by property

In Java 8 how can I filter a collection using the Stream API by checking the distinctness of a property of each object?

For example, I have a list of Person objects and I want to remove people with the same name.

persons.stream().distinct(); 

This will use the default equality check for a Person object, so I need something like:

persons.stream().distinct(p -> p.getName()); 

Unfortunately, the distinct() method has no such overload. Without modifying the equality check inside the Person class, is it possible to do this succinctly?

RichK asked May 16 '14 15:05



2 Answers

Consider distinct to be a stateful filter. Here is a function that returns a predicate that maintains state about what it's seen previously, and that returns whether the given element was seen for the first time:

public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> seen.add(keyExtractor.apply(t));
}

Then you can write:

persons.stream().filter(distinctByKey(Person::getName)) 

Note that if the stream is ordered and is run in parallel, this will preserve an arbitrary element from among the duplicates, instead of the first one, as distinct() does.
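For reference, here is a minimal self-contained sketch of how the helper might be wired up on a sequential stream. The Person class, the DistinctByKeyDemo class, and the distinctByName method are hypothetical names made up for illustration, since the question does not show the real Person class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class DistinctByKeyDemo {

    // Hypothetical minimal Person; the real class is not shown in the question.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Stateful predicate: Set.add returns true only the first time a key is added.
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> seen.add(keyExtractor.apply(t));
    }

    // Hypothetical convenience method: keeps one Person per name.
    static List<Person> distinctByName(List<Person> persons) {
        return persons.stream()
                .filter(distinctByKey(Person::getName))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Alice", 30),
                new Person("Bob", 25),
                new Person("Alice", 41));

        // Sequential stream over a List, so the first Alice (age 30) is kept.
        for (Person p : distinctByName(persons)) {
            System.out.println(p.getName() + " " + p.getAge());
        }
    }
}
```

Two things to keep in mind with this sketch: call distinctByKey once per stream pipeline, because each call creates a fresh "seen" set, and note that the set returned by ConcurrentHashMap.newKeySet() rejects null elements, so a Person with a null name would cause a NullPointerException here.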

(This is essentially the same as my answer to this question: Java Lambda Stream Distinct() on arbitrary key?)

Stuart Marks answered Oct 02 '22 00:10


An alternative would be to place the persons in a map using the name as a key:

persons.stream().collect(Collectors.toMap(Person::getName, p -> p, (p, q) -> p)).values();

Note that the Person that is kept, in case of a duplicate name, will be the first encountered.
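As a rough sketch of the map-based approach, the variant below passes LinkedHashMap::new as the fourth argument to Collectors.toMap so that the resulting values also come back in encounter order. That is an addition beyond the one-liner above, which makes no promise about the order of values(). The Person class, the DistinctByMapDemo class, and the distinctByName method are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class DistinctByMapDemo {

    // Hypothetical minimal Person; the real class is not shown in the question.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Hypothetical convenience method: keeps the first Person seen for each name.
    static List<Person> distinctByName(List<Person> persons) {
        Map<String, Person> byName = persons.stream()
                .collect(Collectors.toMap(
                        Person::getName,              // key: the property to de-duplicate on
                        Function.identity(),          // value: the Person itself
                        (first, duplicate) -> first,  // on a key clash, keep the earlier element
                        LinkedHashMap::new));         // keeps insertion order of the keys
        return new ArrayList<>(byName.values());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Alice", 30),
                new Person("Bob", 25),
                new Person("Alice", 41));

        for (Person p : distinctByName(persons)) {
            System.out.println(p.getName() + " " + p.getAge());
        }
    }
}
```

Unlike the filter-based answer, this collects the whole stream into a map before producing any result, so it is a terminal operation rather than a step that can sit in the middle of a pipeline.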

wha'eve' answered Oct 02 '22 00:10