If I have the following List:
List<String> list = Arrays.asList("hello", "world", "hello");
And I apply the following (Java8):
list.stream().distinct().collect(Collectors.toString());
Then I would get a list with "hello" and "world".
However, in my case, I have a list of a type (from an external api) where I want to "bypass" the equals Method, ideally with a comparator, as it doesn't cover what I need.
Assume this class looks like this:
public class Point {
float x;
float y;
//getters and setters omitted
}
In this case, I would like two points that cover a certain criteria to be defined as equal, for instance (30, 20) and (30.0001, 19.999).
A custom comparator could do the trick, but I have found no API that does what the distinct() in Java8 Stream does, but with a comparator (or similar pattern).
Any thoughts? I know I could write such a function, but I would rather like the elegant way of using existing apis... I have no restriction with external libraries (guava, apache-commons, etc. are welcome if they have a comfortable way of doing what I need).
HashingStrategy is the concept you're looking for. It's a strategy interface that allows you to define custom implementations of equals and hashcode.
public interface HashingStrategy<E>
{
int computeHashCode(E object);
boolean equals(E object1, E object2);
}
Streams don't support hashing strategies but Eclipse Collections does. It has sets and maps that support hashing strategies as well as overloads of methods like distinct()
that take hashing strategies.
This would work well for Strings. For example, here's how we could get all distinct Strings ignoring case.
MutableList<String> strings = Lists.mutable.with("Hello", "world", "HELLO", "World");
assertThat(
strings.distinct(HashingStrategies.fromFunction(String::toLowerCase)),
is(equalTo(Lists.immutable.with("Hello", "world"))));
Or you can write the hashing strategy by hand to avoid garbage creation.
HashingStrategy<String> caseInsensitive = new HashingStrategy<String>()
{
@Override
public int computeHashCode(String string)
{
int hashCode = 0;
for (int i = 0; i < string.length(); i++)
{
hashCode = 31 * hashCode + Character.toLowerCase(string.charAt(i));
}
return hashCode;
}
@Override
public boolean equals(String string1, String string2)
{
return string1.equalsIgnoreCase(string2);
}
};
assertThat(
strings.distinct(caseInsensitive),
is(equalTo(Lists.immutable.with("Hello", "world"))));
This could work for Points too, but only if you can group all points within non-overlapping regions to have the same hashcode. If you're using a Comparator defined to return 0 when two Points are close enough, then you can run into transitivity problems. For example, Points A, B, and C can fall along a line with A and C both close to B but far from each other. Still, if this is a useful concept to you, we'd welcome a pull request adding ListIterable.distinct(Comparator)
to the API.
Note: I am a committer for Eclipse Collections.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With