
Stream and the distinct operation

I have the following code:

    class C {
        String n;

        C(String n) {
            this.n = n;
        }

        public String getN() { return n; }

        @Override
        public boolean equals(Object obj) {
            return this.getN().equals(((C) obj).getN());
        }
    }

    List<C> cc = Arrays.asList(new C("ONE"), new C("TWO"), new C("ONE"));

    System.out.println(cc.parallelStream().distinct().count());

but I don't understand why distinct returns 3 and not 2.

xdevel2000 asked Jan 24 '14 13:01


1 Answer

You need to also override the hashCode method in class C. For example:

    @Override
    public int hashCode() {
        return n.hashCode();
    }

When two C objects are equal, their hashCode methods must return the same value.

The API documentation for interface Stream does not mention this, but it is well known that if you override equals, you should also override hashCode. The API documentation for Object.equals() says:

Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.

Evidently, Stream.distinct() does use the hash code of the objects: once you implement hashCode as I showed above, you get the expected result, 2.
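Putting the pieces together, here is a minimal self-contained sketch of the corrected class. The wrapper class DistinctDemo and the helper countDistinct are hypothetical names added only for illustration, and the null-safe, type-checked equals is an extra hardening step that the question's code did not have:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical wrapper class, used only to make the example runnable.
class DistinctDemo {

    // The class from the question, with hashCode() added.
    static final class C {
        private final String n;

        C(String n) {
            this.n = n;
        }

        String getN() {
            return n;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof C)) return false; // also rejects null
            return Objects.equals(n, ((C) obj).n);
        }

        // Equal objects must produce equal hash codes, so hash the same
        // field that equals() compares.
        @Override
        public int hashCode() {
            return Objects.hashCode(n);
        }
    }

    // Hypothetical helper: counts distinct elements exactly as the question does.
    static long countDistinct(List<C> items) {
        return items.parallelStream().distinct().count();
    }

    public static void main(String[] args) {
        List<C> cc = Arrays.asList(new C("ONE"), new C("TWO"), new C("ONE"));
        System.out.println(countDistinct(cc)); // prints 2
    }
}
```

The key point is that hashCode is derived from the same field that equals compares, so the two methods can never disagree about which objects are duplicates.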

Jesper answered Oct 08 '22 15:10