Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does Collections.sort call Comparator twice with the same arguments?

I'm running an example to understand the behavior of Comparator in Java.

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;


    class HDTV {
    private int size;
    private String brand;

    public HDTV(int size, String brand) {
        this.size = size;
        this.brand = brand;
    }

    public int getSize() {
        return size;
    }

    public void setSize(int size) {
        this.size = size;
    }

    public String getBrand() {
        return brand;
    }

    public void setBrand(String brand) {
        this.brand = brand;
    }
}

class SizeComparator implements Comparator<HDTV> {
    @Override
    public int compare(HDTV tv1, HDTV tv2) {
        int tv1Size = tv1.getSize();
        int tv2Size = tv2.getSize();
 System.out.println("Comparing :: "+tv1.getBrand()+" AND : "+tv2.getBrand());
        if (tv1Size > tv2Size) {
            return 1;
        } else if (tv1Size < tv2Size) {
            return -1;
        } else {
            return 0;
        }
    }
}

public class HelloWorld {
    public static void main(String[] args) {
        HDTV tv1 = new HDTV(55, "Samsung");
        HDTV tv2 = new HDTV(60, "Sony");
        HDTV tv3 = new HDTV(42, "Panasonic");

        ArrayList<HDTV> al = new ArrayList<HDTV>();
        al.add(tv1);
        al.add(tv2);
        al.add(tv3);

        Collections.sort(al, new SizeComparator());
        for (HDTV a : al) {
            System.out.println(a.getBrand());


        }
        }
    }

The output is

Comparing :: Sony AND :Samsung
Comparing :: Panasonic AND : Sony
Comparing :: Panasonic AND : Sony
Comparing :: Panasonic AND : Samsung
Panasonic
Samsung
Sony

Why is it comparing two Objects Panasonic and Sony 2 times consecutively?? I don't find it is required to do that.

like image 396
Anil Kumar Avatar asked Jul 01 '16 02:07

Anil Kumar


1 Answers

If this is Java 7 or later, it's using TimSort. TimSort starts off by running through the input and detecting or gathering ascending runs of 32 or more elements (in this implementation). See countRunAndMakeAscending in the source code.

Runs longer than 32 are left in place for now. Runs shorter than 32 are lengthened by doing a binary insertion sort of subsequent elements into the current run until it's at least 32 elements long. See binarySort in the source code.

(The merge sorting approach is done only after runs of >= 32 are gathered. Since your input has only 3 elements, the entire sort is done using the binary insertion sort, and no merging is done.)

What countRunAndMakeAscending has to do is to detect runs by comparing adjacent elements. First it compares Sony to Samsung and then Panasonic to Sony. The result is a run of length 2, [Samsung, Sony].

Next, binarySort lengthens this run by taking the next element, Panasonic, and inserting it into the right place. A binary search is done to find that place. The midpoint of the run of 2 is location 1, which is Sony, so it compares Panasonic to Sony. (This is the repeated comparison.) Panasonic is less than Sony, so next comparison is between Panasonic and Samsung, which determines the proper insertion point. We now have a run of length 3.

Since the entire input is of length 3, the sort is finished after four comparisons.

The duplicate comparisons occur because countRunAndMakeAscending and binarySort are separate sort phases, and it just so happens that the last comparison of the first phase is the same as the first comparison of the second phase.

like image 94
Stuart Marks Avatar answered Oct 15 '22 02:10

Stuart Marks