I need a way to sort a collection of objects based on the properties of a third object. I will try to describe it using a simplified case.
Suppose we have a Person object
class Person {
String firstName;
String lastName;
...
}
And we would like to sort a collection of Persons relative to a certain person. For example: John Doe is the person we want to find, or if we can't find we want the most 'similar' one to be at the top of the sorted collection.
Similarity is defined as follows: If only the first name matches, then it is a better match then when only the last name matches. Ofcourse if both match, it's a bingo.
I came up with a solution, but I am not sure if it is flawless. The idea is to use a Comparator like the following:
public static class PersonComparator implements Comparator<Person> {
String firstName;
String lastName;
public PersonComparator(String firstName, String lastName) {
this.firstName = firstName;
this.lastName = lastName;
}
@Override
public int compare(Person p1, Person p2) {
int p1Match = calcMatch(p1);
int p2Match = calcMatch(p2);
int result = p1Match - p2Match;
if (result == 0) {
//not very sure about this part
result = p1.firstName.compareTo(p2.firstName);
if (result == 0) {
result = p1.lastName.compareTo(p2.lastName);
}
}
return result;
}
public int calcMatch(Person p) {
StringBuilder builder = new StringBuilder();
builder.append(firstName.equals(p.firstName) ? "1" : "0");
builder.append(lastName.equals(p.lastName) ? "1" : "0");
return Integer.parseInt(builder.toString(), 2);
}
}
Thus if the firstname of Person one would match and lastname wouldn't, he would get the binary match '10' translated into integer 2, while if Person two's first and lastnames both match, the binary value will be '11' translated into 3. compareTo would then simply return 2 - 3 = -1 indicating that one is 'less' then two.
However, what to do if both person's first and last name's both don't match the one we are looking for. The matched 'binary value' will be the same, and returning 0 will indicate that the two person's are equal to each other (at least to a TreeSet for example). When such a comparator is used in a TreeSet, only one of the two persons will last in the resulted set.
This is not the desired behaviour, therefore in the case that both person's result in the same match value, I calculate the value to be returned by compareTo based on the fields comparison of the two persons.
Running the following simple test case shows an example:
public static void main(String[] args) {
List<Person> persons = new ArrayList<Person>();
persons.add(new Person("Pietje", "Puk"));
persons.add(new Person("Jan", "Jansen"));
persons.add(new Person("John", "Doe"));
Comparator<Person> comparator = new PersonComparator("John", "Doe")
int firstCompare = comparator.compare(persons.get(0), persons.get(1));
int secondCompare = comparator.compare(persons.get(1), persons.get(2));
int thirdCompare = comparator.compare(persons.get(0), persons.get(2));
System.out.println(firstCompare + " vs " + secondCompare + " vs " + thirdCompare);
TreeSet<Person> personsSet = new TreeSet<Person>(comparator);
personsSet.addAll(persons);
personsSet.add(new Person("Baby", "Doe"));
personsSet.add(new Person("John", "Roe"));
personsSet.add(new Person("Jane", "Doe"));
int i = 0;
for (Person person : personsSet) {
System.out.println(i++ + ") " + person + " [" + comparator.calcMatch(person) + "]");
}
}
executing the code above results in:
6 vs -3 vs -3
0) Jan Jansen [0]
1) Pietje Puk [0]
2) Baby Doe [1]
3) Jane Doe [1]
4) John Roe [2]
5) John Doe [3]
Where the first comparison was based on the firstname (Pietje Puk vs Jan Jansen) and resulted in 6. The second comparison was based on the lastname compared to the pivot (Jan Jansen vs John Doe) and resulted in -3 while the last one was also based on the lastname compared to the pivot (Pietje Puk vs John Doe) and resulted in -3 as well.
As commented in the code, I am not sure about the solution to the problem in compareTo where both fields match similarly, but have different values. Since the 'match' code always calculates a value from 0 to 3, the 'field comparison' can have much higher values and I am not sure if 'mixing' those numbers is a good idea.
Has anyone faced with a similar problem or can confirm that my solution is conform the contracts and has no flaws? Ideally I would like to have a comparator that can be used by a TreeSet, thus compareTo should only return a 0 if the persons are really not equal.
Another solution I have though of is to put the 'pivot' as a 'normal' "Person" object in the treeset, and use a simple comparator based on the fields of the two persons provided to the compareTo method. Once the collection is sorted, I can search for the pivot object and then I know that elements nearby it have the highest match. This solution however doesn't sound really elegant and might not always be applicable.
If you take each of the two first names and two last names matching as independent boolean values, that gives four variables with 24 = 16 combinations. You could check each of those 16 combinations right in your compare method.
public int compare(Person p1, Person p2) {
boolean f1 = p1.firstName.equals(firstName));
boolean f2 = p2.firstName.equals(firstName));
boolean l1 = p1.lastName .equals(lastName));
boolean l2 = p2.firstName.equals(lastName));
if ( f1 && f2 && l1 && l2) { return 0; }
if ( f1 && f2 && l1 && !l2) { return +1; }
if ( f1 && f2 && !l1 && l2) { return -1; }
if ( f1 && f2 && !l1 && !l2) { return p1.lastName.compareTo(p2.lastName); }
if ( f1 && !f2 && l1 && l2) { return +1; }
if ( f1 && !f2 && l1 && !l2) { return +1; }
if ( f1 && !f2 && !l1 && l2) { return +1; }
if ( f1 && !f2 && !l1 && !l2) { return +1; }
if (!f1 && f2 && l1 && l2) { return -1; }
if (!f1 && f2 && l1 && !l2) { return -1; }
if (!f1 && f2 && !l1 && l2) { return -1; }
if (!f1 && f2 && !l1 && !l2) { return -1; }
if (!f1 && !f2 && l1 && l2) { return p1.firstName.compareTo(p2.firstName); }
if (!f1 && !f2 && l1 && !l2) { return +1; }
if (!f1 && !f2 && !l1 && l2) { return -1; }
if (!f1 && !f2 && !l1 && !l2) { return p1.firstName.compareTo(p2.firstName); }
}
If you then combine some of the similar cases, you can reduce this to a more meaningful set of checks:
public int compare(Person p1, Person p2) {
boolean f1 = p1.firstName.equals(firstName));
boolean f2 = p2.firstName.equals(firstName));
boolean l1 = p1.lastName .equals(lastName));
boolean l2 = p2.firstName.equals(lastName));
// Same names.
if (f1 && f2 && l1 && l2) { return 0; }
// One name matches and the other doesn't.
if ( f1 && !f2) { return +1; }
if (!f1 && f2) { return -1; }
if ( l1 && !l2) { return +1; }
if (!l1 && l2) { return -1; }
// Both match first or both match last.
if ( f1 && f2) { return p1.lastName .compareTo(p2.lastName); }
if ( l1 && l2) { return p1.firstName.compareTo(p2.firstName); }
// Completely different names. Sort based on first name.
return p1.firstName.compareTo(p2.firstName);
}
Your question boils down to this: does the comparator induce an ordering that is total (in the precise mathematical sense) or not?
I believe it does. You first map all values to a range between 0 and 3. This is your most important attribute to sort on, so you test it first. Now if they are different, you use integer difference to indicate the ordering which is "totally" fine. If they are the same you start ordering lexicographically first by first name then by last name. The lexicographic ordering is of course total. So you are fine again.
As said in other answers nothing else matters. You do not have to worry about the actual size of the int returned by the comparator.
What does very much matter, but you do not show here, is that your equals method on Person should return true if and only if compareTo returns 0. Your compareTo method can only return 0 if both Persons have the same first name and last name. So if that is true, then equals should also do that. Check that. Ok. Then the other direction. Check there are no more other occasions in which your equals returns 0. Done.
Finally, if you do not trust your reasoning there exists a reasonably good way of testing. Create a random person generator, generate pairs and triples of persons and test the rules for a total ordering for millions of combinations. I.e. if a < b then !(b < a), etc. If we did miss something, chances are a few runs of this setup will point out the flaws in our reasoning.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With