I noticed something odd while doing profiling our code base. It seemed that sorting with a typed comparator (e.g. Comparator<MyClass>
) always first called a method Comparator<MyClass>.compare(Object,Object)
which then called the method Comparator<MyClass>.compare(MyClass,MyClass)
. Furthermore, the vast majority of the time was spent in the Comparator<MyClass>.compare(Object,Object)
. To explore further, I made a little test program:
public class Sandbox {
public static void main(String argv[]) {
for(int j=0; j<100; j++) {
int n = 10000;
SortMe[] sortMes = new SortMe[n];
for (int i=0; i<n; i++) {
sortMes[i] = new SortMe(Math.random());
}
Arrays.sort(sortMes, new SortMeComp());
System.out.println(Arrays.toString(sortMes));
}
for(int j=0; j<100; j++) {
int n = 10000;
SortMe[] sortMes = new SortMe[n];
for (int i=0; i<n; i++) {
sortMes[i] = new SortMe(Math.random());
}
Arrays.sort(sortMes, new SortMeCompTypeless());
System.out.println(Arrays.toString(sortMes));
}
}
}
The typed Comparator:
public class SortMeComp implements Comparator<SortMe>{
public int compare(SortMe one, SortMe two) {
if(one.getValue()>two.getValue()) {
return -1;
} else if (one.getValue()<two.getValue()) {
return 1;
} else {
return 0;
}
}
}
The untyped Comparator I made for comparison:
public class SortMeCompTypeless implements Comparator{
public int compare(Object oneObj, Object twoObj) {
SortMe one = (SortMe) oneObj;
SortMe two = (SortMe) twoObj;
if(one.getValue()>two.getValue()) {
return -1;
} else if (one.getValue()<two.getValue()) {
return 1;
} else {
return 0;
}
}
}
Here are the results (from YourKit profiler; let me know if you'd rather have a screenshot):
+----------------------------------------------------+-----------------+-----------------+--------------------+
| Name | Time (ms) | Own Time (ms) | Invocation Count |
+----------------------------------------------------+-----------------+-----------------+--------------------+
| +---java.util.Arrays.sort(Object[], Comparator) | 23,604 100 % | 8,096 | 200 |
| | | | | |
| +---SortMeComp.compare(Object, Object) | 11,395 48 % | 7,430 | 12,352,936 |
| | | | | | |
| | +---SortMeComp.compare(SortMe, SortMe) | 3,965 17 % | 3,965 | 12,352,936 |
| | | | | |
| +---SortMeCompTypeless.compare(Object, Object) | 4,113 17 % | 4,113 | 12,354,388 |
+----------------------------------------------------+-----------------+-----------------+--------------------+
I ran the profile without filtering, and you see the recursive calls to mergeSort (which just make it hard to read), but nothing of interest.
So what's going on here? Where is that method SortMeComp.compare(Object,Object) coming from? We figured it was something that Java creates internally for dealing with generics, but what could be taking so long? I would think the jvm would just treat a generic method like an "untyped"/Object method. As you can see, a simple cast is far quicker. Besides which, I would think this is exactly the kind of thing that would be JITed away even if the jvm needed to do upfront type stuff. What's going on here?
By the way:
$ java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
Edit:
In response to savinos answer, I tried simulating the extra method call with an 'untyped' Comparator that simply cast to a typed compare:
public class SortMeCompMethodCalls implements Comparator{
public int compare(Object oneObj, Object twoObj) {
return compare((SortMe)oneObj, (SortMe)twoObj);
}
public int compare(SortMe one, SortMe two) {
if(one.getValue()>two.getValue()) {
return -1;
} else if (one.getValue()<two.getValue()) {
return 1;
} else {
return 0;
}
}
}
Here are the results:
+---------------------------------------------------------+-----------------+-----------------+--------------------+
| Name | Time (ms) | Own Time (ms) | Invocation Count |
+---------------------------------------------------------+-----------------+-----------------+--------------------+
| +---java.util.Arrays.sort(Object[], Comparator) | 31,044 100 % | 8,061 | 200 |
| | | | | |
| +---SortMeComp.compare(Object, Object) | 11,554 37 % | 7,617 | 12,354,392 |
| | | | | | |
| | +---SortMeComp.compare(SortMe, SortMe) | 3,936 13 % | 3,936 | 12,354,392 |
| | | | | |
| +---SortMeCompMethodCalls.compare(Object, Object) | 11,427 37 % | 7,613 | 12,352,146 |
| | | | | |
| +---SortMeCompMethodCalls.compare(SortMe, SortMe) | 3,814 12 % | 3,814 | 12,352,146 |
+---------------------------------------------------------+-----------------+-----------------+--------------------+
So it looks like savinos is right! The extra time is only the extra method call (plus a little for the cast). That seems crazy to me; you'd think that'd be JITed away? Ah well.
I removed edit 2 and added it as an answer as it should've been originally.
The time complexity including the comparator is O(n * (log n) * c) , where c is the time complexity of the comparator. By default, Arrays. sort uses a comparator following the natural ordering of the content for an O(1) time complexity.
Using a comparator, we can sort the elements based on data members. For instance, it may be on roll no, name, age, or anything else. Method of Collections class for sorting List elements is used to sort the elements of List by the given comparator.
The compare Method obj1 and obj2 are the objects to be compared. This method returns zero if the objects are equal. It returns a positive value if obj1 is greater than obj2. Otherwise, a negative value is returned.
I may be wrong, but I would say that the delta between the "Object" comparator and the typed comparator (which is called by the generic one) is simply due to the extra function call.
Consider that you are doing 12,352,936 invocations, which means roughly 5.7*10^-7 seconds per function call, which is not unreasonable.
A little off topic, but you must want it to be fast...
You'll reduce the inner compare() time by around 50%, with random data, if you change it to something like:
public int compare(SortMe one, SortMe two) {
return one.getValue() - two.getValue();
}
However, this is only valid if the magnitude of the range of inputs is smaller than 2^31. If larger, the difference overflows.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With