Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sorting ArrayList can fail when using list in comparator. Is this documented?

ArrayLists seem to be sorted with TimSort where the underlying list is not always consistent during sorting. It can happen that list entries disappear or appear twice upon calling the comparator.

In our comparator we are comparing keys for which we are using a function to get a value to compare for this key. As this function is used in other contexts we have a test whether the key is actually present in the list (something that wouldn't be necessary in the sorting):

        if (keys.contains(itemId)) {
          ...

As keys is the list we are sorting it can happen in the comparator that a key is not found in the list due to the internal mechanics of TimSort.

Question: Is this mentioned somewhere in Javadoc (couldn't find it) that you are not supposed to access underlying list in Comparator? Is this a poor implementation of TimSort that should sort a copy? Or was it a stupid idea in the first place to access underlying list in comparator?


The program below, provided by T.J. Crowder, demonstrates that the contents of the underlying list may not be consistent during calls to the Comparator. (This program demonstrates the phenomenon in question, but it isn't representative of the actual application that's affected by the issue.)

import java.util.*;

public class Example {
    private static String[] chars = {
        "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"
    };

    private List<String> list;
    private String[] entries;

    private Example() {
        this.entries = new String[1000];
        for (int n = 0; n < 1000; ++n) {
            this.entries[n] = chars[n % chars.length] + n;
        }
        // Ensure it's an ArrayList, specifically
        this.list = new ArrayList<String>(Arrays.asList(this.entries));
    }

    public static void main(String[] args) {
        (new Example()).run();
    }

    class ListComparator implements Comparator<String> {
        public int compare(String a, String b) {
            for (String s : entries) {
                int i1 = Example.this.list.indexOf(s);
                if (i1 == -1) {
                    System.out.println(s + ": Missing");
                } else {
                    int i2 = Example.this.list.lastIndexOf(s);
                    if (i2 != i1) {
                        System.out.println(s + ": Duplicated, at " + i1 + " and " + i2);
                    }
                }
            }
            return a.compareTo(b);
        }
    }

    private void run() {
        this.list.sort(new ListComparator());
    }
}

Here are the first few lines of output from a run:

b1: Missing
a52: Duplicated, at 2 and 32
b27: Missing
a52: Duplicated, at 2 and 32
c2: Missing
a52: Duplicated, at 2 and 32
c2: Missing
c28: Missing
a52: Duplicated, at 2 and 32
b53: Duplicated, at 5 and 33
c28: Missing
d29: Missing
a52: Duplicated, at 2 and 32
b53: Duplicated, at 5 and 33
d3: Missing
d29: Missing
a52: Duplicated, at 2 and 32
b53: Duplicated, at 5 and 33
d3: Missing
d29: Missing
e30: Missing
like image 237
Thomas M Avatar asked Jan 06 '19 11:01

Thomas M


People also ask

How do you sort elements in an ArrayList using Comparator interface?

We can simply implement Comparator without affecting the original User-defined class. To sort an ArrayList using Comparator we need to override the compare() method provided by comparator interface. After rewriting the compare() method we need to call collections. sort() method like below.

Can ArrayList be sorted?

Approach: An ArrayList can be Sorted by using the sort() method of the Collections Class in Java. This sort() method takes the collection to be sorted as the parameter and returns a Collection sorted in the Ascending Order by default.

How does sort work with Comparator?

Using a comparator, we can sort the elements based on data members. For instance, it may be on roll no, name, age, or anything else. Method of Collections class for sorting List elements is used to sort the elements of List by the given comparator.

What happens while sorting if the object isn't comparable?

Sorting a Map with TreeMap Remember, though: if the object doesn't implement Comparable , a ClassCastException will be thrown.


1 Answers

A bit of history here: in JDK 7, the TimSort algorithm replaced the previous "legacy merge sort" algorithm. In JDK 8, Collections.sort() delegates to the new default method List.sort(). This default method is overridden by ArrayList, which does the sort in-place. The previous Collections.sort() implementation would copy the list to a temporary array, perform the sort on that temporary array, and then copy the elements from the temporary array back to the original list.

If the sort comparator looks in the list being sorted, then its behavior will certainly be affected by the new in-place sorting behavior of ArrayList introduced in JDK 8. The change from the "legacy merge sort" to TimSort in JDK 7 probably has no impact in this case, since JDK 7 still did the sorting on a temporary copy.

The copy-sort-copyback behavior of List.sort() is described in an "Implementation Requirements" section which specifies the behavior of the default method implementation, but which isn't part of the interface's contract that is imposed on all implementations. Thus, ArrayList (and other subclasses) are free to change this behavior. I note that there is no documentation for the overriding implementation ArrayList.sort(). I suppose it would be a small improvement if some documentation were added that specified the in-place sorting behavior.

If the in-place sorting of ArrayList is a problem, you could copy the list before sorting it:

List<Key> list = ... ;
List<Key> newList = new ArrayList<>(list);
newList.sort(keyComparator); // uses the old list
list = newList;

Alternatively, perhaps you could give some more details about the way the comparator works, and we might be able to figure out a way to rewrite it so that it doesn't need to look in the list being sorted. (I'd suggest asking another question for this.)

like image 179
Stuart Marks Avatar answered Sep 29 '22 06:09

Stuart Marks