Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do you implement compareTo() in a class with a lot of fields when you only care about a single value?

In the case where we have a class with a lot of fields - let's say for example a BankAccount class:

class BankAccount implements Comparable<BankAccount> {
    Double amount;
    String name;
    Instant activeTime;
    String currency;
    // Insert a lot more fields here
    
    @Override
    public int compareTo(BankAccount other) {
        return amount.compareTo(other.amount);
    }
}

We want to preserve a natural order across all BankAccount instances (so that they're Comparable), but we also want to enforce that the equals() method returns true when compareTo() is zero. How do we write a compareTo() method in this case? I want to say "compare the amount first, and then consistently compare the remaining fields in a manner that I don't care about". If I exclude the other fields from the compareTo() method, then equals() will have inconsistent behavior with compareTo(), but I also don't want to chain comparisons manually across every field within BankAccount. Do I have to use Comparator, or is there a better solution?

It's straight forward enough to either:

  1. Use Comparator instead of implementing Comparable
  2. Violate the equals() and compareTo() methods so that compareTo() returns 0, but equals may or may not return true
  3. Use something like a guava comparison chain to manually list out the remainder of the fields that I'd like to compare.

But I want to implement Comparable in this case. What are my options?

like image 946
Lyee Avatar asked Oct 22 '25 15:10

Lyee


2 Answers

The answer is #1 (Use a Comparator; do not have your BankAccount class implement Comparable), and compose comparators with .thenComparing if you need total ordering.

But first, some explanation as to why only 'supply a Comparator; this class should not implement Comparable' is appropriate:

Semantics - it's not utterly inherent to the concept

implementing Comparable implies something called natural order – an order that is utterly inherent to the very heart of the thing your class is modeling. For example, Integer has this property. Not the class, the concept. Integers as a concept have an order to them so inherently obvious that nobody is ever going to be confused about it. 1 comes before 2. Of course it does. This warrants no explanation.

Your BankAccount simply does not have this.

Hence, stylistically clearly the only correct answer is option 1. It is not inherent to the notion of a bank account that one can order 2 bank accounts based specifically on the balance of them. It is plausible that if one says: "Order these 2 bank accounts" with no further context that one would assume that they must be ordered on the bank account 'id', which your class does not actually have, but most folks' conceptualization of what a bank account is, does.

Even if you disagree with all that, there is a second problem:

Pragmatics, comparisons must be stable and fast

Your bank account class doesn't even have a 'total' ordering (a total ordering is that any given bank account is necessarily always above or below any other, and is comparison-wise at the same level (a.compareTo(b) returns 0) only if a and b are also a.equals(b)). That's because currency is involved.

Quick, order me these 2 bank accounts, please:

BankAccount swiss = new BankAccount(17.0, "Erik Mueller", Instant.now(), "CHF");

BankAccount euro = new BankAccount(18.103, "Jannie Molenaar", someInstant2WeeksAgo, "EUR");

Which one is 'higher'?

That's incredibly difficult to tell. Today, by less than a % difference, the euro account is 'bigger', but in as little time as a single minute that could change; forex is a 24/7/366 kinda deal, and it even depends on which broker you ask.

The obvious conclusion then is to simply decree that any 2 bank accounts cannot be compared on 'how large they are' at all unless the currencies are the same; if they are not, they simply won't compare in the first place. Which is fine, Comparators do not need to totally order and are entirely free to throw an exception (they can not, however, decree that these 2 bank accounts are therefore comparison-wise equal, due to the commutativity requirement! Exception is the only sane way).

An alternative is to simply return whatever bank account has the largest amount regardless of forex (so, the euro one happens to win here, but, say, a bank account with 1000 yen in it 'wins' from both of these even though 1000 yen is worth far less than 17 swiss francs or 18 euros. (it's about 6 euros; 1 yen is worth less than a eurocent!). That clearly cannot possibly be a 'natural order' because it is ridiculous - not obvious at all, and thus does not qualify.

You are technically allowed by the spec to implement Comparable and provide a compareTo operation that will throw an exception on many comparisons as long as it adheres to the 4 rules if no exception occurs, but that's far removed from what folks expect, and more generally, "It compares bank accounts based on the amount in them if the currency is identical; otherwise, exception" is so far removed from the one and only obvious way any 2 bank accounts could possibly be compared, that it's a very bad idea to make the natural order.

Trick 1 - provide it in BankAccount itself

Nevertheless, "I just want to sort this list of BankAccount objects based on their balances, I'll ensure they are all in the same currency" might come up a lot. You might even have this:

public interface ForexRateSource {
  long convert(Currency from, long amount, Currency to) throws CurrencyConversionException;
}

and then have an implementation for that based on e.g. querying the xe.com API or another that just checks a static table, or a third that checks a database. You want the flexibility to generate comparators based on need. Given the above, you can then add:

public class BankAccount {
  /**
   * Orders bank accounts based on their balance.
   *
   * Comparison will fail with an {@code IllegalStateException} if asked to compare 2 accounts using different currencies.
   */
  public static Comparator<BankAccount> byBalance() {
    return (a, b) -> {
      if (a.currency.equals(b.currency)) {
        return Long.compare(a.amount, b.amount);
      }
      throw new IllegalStateException("currency mismatch");
    };
  }

  /**
   * Orders bank accounts based on their balance.
   *
   * Comparison between 2 accounts using different currencies are solved by converting both to EUR and then comparing the result.
   */
  public static Comparator<BankAccount> byBalance(ForexRateSource forex) {
    return (a, b) -> {
      if (a.currency.equals(b.currency)) {
        return Long.compare(a.amount, b.amount);
      }
      long aEur = forex.convert(a.currency, a.amount, "EUR");
      long bEur = forex.convert(b.currency, b.amount, "EUR");
      return Long.compare(aEur, bEur);
    };
  }
}

This way any user of your BankAccount API that needs a thing can just ask BankAccount to provide it. You can add on a comparator that compares other things, such as one that compares bank account ids, or that compares last access time, or whatever you need.

Composing

You might want a comparator that is nevertheless total, simply because you want to e.g. shove bank accounts as keys in a treemap and you don't want 2 bank accounts that aren't equal but happen to have the same balance to exclude each other - treemap requires total ordering.

You can make more methods but instead, 'compose' your comparators. You can combine 2 comparators using the .thenComparing method, which comparators have, and lets you say 'if comparing 2 things with this comparator says they are on equal footing, then move on to this other comparator'. You can use this to compose multiple comparators: You construct 'compare balance first, but if they are equal, compare on account ID instead' based on .thenComparing to connect a comparator that just compares balances with one that just compares account ids.

Big fat warning 1 - double is a big nono.

You do not want to use double for currencies, it is dangerous to do so. It initially looks like it works but it will break in crazy ways, shaving off or creating cents left and right.

Here is a trivial example and you really really need to run this to fully understand why you must never ever ever use double to represent currency stuff:

double v = 0.0;
for (int i = 0; i < 10; i++) v += 0.1;
System.out.println("10 * 0.1 is 1.0.. of course it is. Right? " + (v == 1.0));

And then gasp in sheer surprise when that prints.. false. It's not.

Here's a simple enough way to accurately understand why this does not work: Write down one third on a card. In decimal. Go ahead. I'll wait.

Wrong. You messed it up. It is, in fact, impossible. 0.333333 - you do not have enough space no matter how large that card is to write it out 'correctly'. You have to round somewhere to make this work.

The only numbers that ever 'work' is if the divisor (here, 3, I asked you to write down 1/3), if broken down into prime factors, has only twos and fives as factors. For example, 1/200 works, because 200 is 2*2*2*5*5. And it's 'only twos and fives' because it's a decimal system - base 10, and 2 and 5 are the prime factors of 10.

Computers count in binary, not decimal. Hence, Only numbers whose divisor is a power of 2 can be perfectly represented. These are the only ones that 'do not run out of space'. Change the above code but make that for (int i = 0; i < 8; i++) v += .125; and all of a sudden it does print true. That's because 8 'works'; that's 2*2*2.

The problem is, currencies are represented in base 100, and 100 breaks down into 2*2*5*5 and thus, 1/100th is exactly like 1/3 to a computer - not perfectly representable.

So, what happens? CPU rounds, silently. You really really do not want this.

Solution to the rounding thing

All currencies have a well defined atomic unit. For euros, its eurocents. For yen it's just yen. For bitcoin, it's satoshis. There's always one. Represent your stuff in that. Yes, you now cannot divide 4 cents amongst 3 people but that's correct; even if somehow you cook up a crazy scheme where you can store the notion of 'one third of a cent' somehow, there is no bank that will let you transfer a third of a cent. You're forced into thinking how you round things and that is a good thing because you have to.

A much more convoluted alternative is to use BigDecimal. It doesn't give you much more power (you still can't divide 4 cents amongst 3 accounts with BigDecimal), but it does not silently round, it can perfectly represent 1/100th. It can even come in a bit handy with foreign exchange, as it can accurately multiply any amount with an exchange rate with any amount of precision (but then you have to think about how to round it back down, a bank still does not let you transfer 14.000058129 cents, after all).

Big fat warning 2 - consistency

The forex schemed comparator cannot actually be used by a TreeMap without causing problems, because treemap demands consistency. If a comes before b when you add a and b to the map, then it has to still be true later on, or all heck breaks loose and your map straight up starts lying to you (you iterate through it and some account is in there, then you ask it for the value associated with that account and the map says that account is NOT in there). Hence, that forex rate provider thing cannot ever update or it would break the map.

This even further highlights how incredibly tricky it is going to get if you attempt to store bank accounts 'sorted on total value' in the face of foreign exchange rates. That simply isn't possible at all.

Even without forex, it's complicated: You can't put a bank account into a treemap that sorts on balance, and then update the balance, because treemap will not magically figure out it needs to 're-order' that bank account because now has a different value. You'd have to take it out of the map, modify it, and then put it back into the map.

Using any sort of comparison stuff (be it Comparable or Comparator) is smooth sailing as long as ordering never ever changes - i.e. the ordering is based on fields that are all final, whose values are immutable, and the comparator code does not involve e.g. foreign exchange conversion APIs that change over time.

If that's not true, it gets complicated. Most things you might wanna do with them (such as use them as the comparator powering a TreeMap) do not work and cannot be made to work.

Adhere to the rules

As per the docs; adhere to these rules. Failure to do so means e.g. TreeMap will start lying to you and act in bizarre, unspecified fashion:

  1. a.compareTo(a) must return 0.
  2. If a.compareTo(b) < 0 and b.compareTo(c) < 0, then a.compareTo(c) must also be < 0. This is trickier than it looks.
  3. If a.equals(b) then a.compareTo(b) must return 0; the reverse is not true (a.compareTo(b) == 0? Then a or b might be equal, or not, either is fine).
  4. If a.compareTo(b) < 0 then b.compareTo(a) must be > 0. Same for == 0. This means it's hard to allow subtypes to have a meaningful comparison with supertypes unless the subtype adds no comparison-affecting state at all. Yet another reason to use Comparator instead.
  5. If a.compareTo(b) < 0 right now, it must continue to return that in the future.
like image 190
rzwitserloot Avatar answered Oct 25 '25 06:10

rzwitserloot


There is no requirement, contract or hard expectation that equals() agrees with compareTo().

Consider two horses race results, with fields horse, race and position, that tie for first place. The natural ordering of race results is by position. The two instances are not equal (they are for different horses), but they have the same rank (of first) so their relative comparison value is 0.

The actions of comparing for equality and comparing for ordering are separate concerns not usually performed together.

like image 38
Bohemian Avatar answered Oct 25 '25 04:10

Bohemian