Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Proper way of sorting Java beans by multiple fields

We have code with complex Comparators that have been used for sorting java objects throughout our application. Historically these have worked, but since the introduction of TimSort in Java 7 we occasionally get the Comparison method violates its general contract! error.. depending on what data is held within the object.

Here is an example of one of our legacy Comparators (which could be almost a decade old - excuse the dodgyiness):

 public int compare(TemplateBean b1, TemplateBean b2) {

  // avoid null pointer exceptions
  if (b1 == null && b2 == null) return 0;
  if (b1 == null) return 1;
  if (b2 == null) return -1;

  int cmp = 0;
  if ("UNATTACHED".equals(b1.getStatusCode()) &&
     !"UNATTACHED".equals(b2.getStatusCode())) {
     cmp = 1;
  }
  if (!"UNATTACHED".equals(b1.getStatusCode()) &&
     "UNATTACHED".equals(b2.getStatusCode())) {
     cmp = -1;
  }
  if (!"UNATTACHED".equals(b1.getStatusCode()) &&
     !"UNATTACHED".equals(b2.getStatusCode()) &&
     !"FIELDSIMPLE".equals(b1.getRefRltshpTypeCode()) &&
     !"FIELDSIMPLE".equals(b2.getRefRltshpTypeCode()) &&
     !"CUSTOM".equals(b1.getRefRltshpTypeCode()) &&
     !"CUSTOM".equals(b2.getRefRltshpTypeCode()) &&
     !"FUNCTION".equals(b1.getRefRltshpTypeCode()) &&
     !"FUNCTION".equals(b2.getRefRltshpTypeCode())) {
     String parent1 = b1.getGroupCode() == null ? "" : b1.getGroupCode().toUpperCase();
     String parent2 = b2.getGroupCode() == null ? "" : b2.getGroupCode().toUpperCase();
     cmp = parent1.compareTo(parent2);
  }

  if (cmp == 0) {
     Integer i1 = b1.getSortOrder() == null ? Const.ZERO : b1.getSortOrder();
     Integer i2 = b2.getSortOrder() == null ? Const.ZERO : b2.getSortOrder();
     cmp = i1.compareTo(i2);
  }

  if (cmp == 0) {
     String s1 = b1.getShortDescription();
     if (s1 == null) s1 = "";
     String s2 = b2.getShortDescription();
     if (s2 == null) s2 = "";
     cmp = s1.compareToIgnoreCase(s2);
  }

  return cmp;  }

So, I want to replicate this functionality, but with a Comparator that is safe for use with TimSort.

From the code you can see that there a multiple levels to this comparison..

  1. It will compare the group codes.
  2. If Group codes are the same it will compare sort order.
  3. If Sort order is the same it will compare description.

Which means it will return the result of the compare at a particular level. Which might be the comparison result of two strings or two integers. I think this is what is breaking the TimSort.

The only way I have been able to make this Comparator work around the General Contract issue is by hashing the contents of the bean and performing a string comparison. Other ideas have included writing our own sorting function.. Surely there is a better way?

Should the bean be constructed in another way to support this?

like image 892
Sprooose Avatar asked Oct 23 '13 01:10

Sprooose


2 Answers

The main problem with the above Comparator is that it is not transitive. It may seem to 'work' on older JDKs because they didn't provide detection for broken comparators, but it just couldn't work correctly in general case and buggy behavior was not revealed until JDK 7.

The source of its non-transitiveness is in conditional comparison on groupCode property. Consider situation when comparator orders objects A and B as A < B due to sortOrder field omitting comparison by groupCode because "FUNCTION".equals(B.getRefRltshpTypeCode()) and objects B and C are ordered as B < C due to sortOrder. But it is possible that A and C when compared directly are ordered as C < A due to groupCode comparison. And this breaks transitivity requirement for Comparator.

To fix this problem groupCode should be always taken into account and every object for which groupCode is skipped due to refRltshpTypeCode value should be treated for example as smaller than any object for which groupCode is now used for comparison.

Compare method should looks something like (this is just to give you the idea):

public int compare(TemplateBean b1, TemplateBean b2) {

    // avoid null pointer exceptions
    if (b1 == null && b2 == null) return 0;
    if (b1 == null) return 1;
    if (b2 == null) return -1;

    int cmp = 0;
    if ("UNATTACHED".equals(b1.getStatusCode()) &&
       !"UNATTACHED".equals(b2.getStatusCode())) {
        cmp = 1;
    }
    if (!"UNATTACHED".equals(b1.getStatusCode()) &&
       "UNATTACHED".equals(b2.getStatusCode())) {
       cmp = -1;
    }

    if (shouldBeComparenByGroupCode(b1) != shouldBeComparedByGroupCode(b2)) {
        if (!shouldBeComparenByGroupCode(b1)) {
            return -1;
        } else {
           return 1;
        }
    }

    if (shouldBeComparenByGroupCode(b1) && shouldBeComparenByGroupCode(b2)) {
        String parent1 = b1.getGroupCode() == null ? "" : b1.getGroupCode().toUpperCase();
        String parent2 = b2.getGroupCode() == null ? "" : b2.getGroupCode().toUpperCase();
        cmp = parent1.compareTo(parent2);
    }

    if (cmp == 0) {
        Integer i1 = b1.getSortOrder() == null ? Const.ZERO : b1.getSortOrder();
        Integer i2 = b2.getSortOrder() == null ? Const.ZERO : b2.getSortOrder();
        cmp = i1.compareTo(i2);
    }

    if (cmp == 0) {
        String s1 = b1.getShortDescription();
        if (s1 == null) s1 = "";
        String s2 = b2.getShortDescription();
        if (s2 == null) s2 = "";
        cmp = s1.compareToIgnoreCase(s2);
    }

    return cmp;
}

where

private static boolean shouldBeComparenByGroupCode(TemplateBean b1) {
     return !"UNATTACHED".equals(b1.getStatusCode()) &&
            !"FIELDSIMPLE".equals(b1.getRefRltshpTypeCode()) &&
            !"CUSTOM".equals(b1.getRefRltshpTypeCode()) &&
            !"FUNCTION".equals(b1.getRefRltshpTypeCode());
}
like image 145
Roman Konoval Avatar answered Sep 30 '22 18:09

Roman Konoval


The answer from @RomanKonovai is correct, however adding some more details.

Think about how the code compares these three objects, and assume that all non-reference:

               A          B          C
Status         UNATTACHED UNATTACHED UNATTACHED
RefRltshpType  CUSTOM     FUNCTION   CUSTOM
Group          Cat        Ball       Apple
SortOrder      10         20         30

Going through the implementation in the question, we can see that A < B, and B < C, and C < A. In other words A < B < C < A, or A < A. This is clearly not logical, and happens since depending on the values of Status and RefRltshpType the sort order is determined by either Group or SortOrder, and there is nothing to tie these two together. In essence this means that your sort order is undefined, since the result is entirely dependent on what order the input is in, that is sort(sort(List)) may not give the same result as sort(List).

The way to fix this is to do something like:

private int objectCompare(String allowed, Comparable v1, Comparable v2) {
  if (v1 == v2) return 0;
  if (v1 == null) return 1;
  if (v2 == null) return -1;
  boolean c1 = v1.equals(allowed);
  boolean c2 = v2.equals(allowed);
  return c1 ? c2 ? 0 : 1 : c2 ? -1 : 0;
}
private int objectCompare(Comparable v1, Comparable v2) {
  if (v1 == v2) return 0;
  if (v1 == null) return 1;
  if (v2 == null) return -1;
  return v1.compare(v2);
}
public int compare(TemplateBean b1, TemplateBean b2) {

  // avoid null pointer exceptions
  if (b1 == b2) return 0;
  if (b1 == null) return 1;
  if (b2 == null) return -1;

  int cmp = objectCompare("UNATTACHED", b1.getStatusCode(), b2.getStatusCode());
  if (cmp == 0) {
    cmp = objectCompare("FIELDSIMPLE", b1.getRefRltshpTypeCode(), b2.getRefRltshpTypeCode());
    if (cmp == 0) {
      cmp = objectCompare("CUSTOM", b1.getRefRltshpTypeCode(), b2.getRefRltshpTypeCode());
      if (cmp == 0) {
        cmp = objectCompare("FUNCTION", b1.getRefRltshpTypeCode(), b2.getRefRltshpTypeCode());
        if (cmp == 0) {
          cmp = objectCompare(b1.getGroupCode(), b2.getGroupCode());
          if (cmp == 0) {
            cmp = objectCompare(b1.getSortOrder(), b2.getSortOrder());
            if (cmp == 0) {
              cmp = objectCompare(b1.getShortDescription(), b2.getShortDescription());
            }
          }
        }
      }
    }
  }

  return cmp;
}
like image 29
Paul Wagland Avatar answered Sep 30 '22 20:09

Paul Wagland