From the:
http://svn.python.org/projects/python/trunk/Objects/listsort.txt
and:
http://en.wikipedia.org/wiki/Timsort
I see, that Timsort has some optimizations when a0 > a1 > a2 > ...
, but what about next array:
10000,10000,9999,9999,9998,9998,....,9,9,8,8,7,7,6,6,5,5,4,4,3,3,2,2,1,1,0,0
What is a time efficiency for such array?
(integers were used to simplify an example, stable sorting is required) I have made some measurements and, seems, such arrays are not "good" cases for Timsort.
actually, TimSort in JDK http://cr.openjdk.java.net/~martin/webrevs/openjdk7/timsort/raw_files/new/src/share/classes/java/util/TimSort.java has a method "countRunAndMakeAscending"
@SuppressWarnings("unchecked")
private static int countRunAndMakeAscending(Object[] a, int lo, int hi) {
assert lo < hi;
int runHi = lo + 1;
if (runHi == hi)
return 1;
// Find end of run, and reverse range if descending
if (((Comparable) a[runHi++]).compareTo(a[lo]) < 0) { // Descending
while(runHi < hi && ((Comparable) a[runHi]).compareTo(a[runHi - 1]) < 0)
runHi++;
reverseRange(a, lo, runHi);
} else { // Ascending
while (runHi < hi && ((Comparable) a[runHi]).compareTo(a[runHi - 1]) >= 0)
runHi++;
}
return runHi - lo;
}
why not to implement it in another way:
private static int countRunAndMakeAscending(Object[] a, int lo, int hi) {
int runHi = lo;
int lastEqual = lo;
int ascending = 0;
while (++runHi < hi) {
int c = ((Comparable) a[runHi+1]).compareTo(a[runHi]);
if (ascending == 0) {
if (c != 0) {
if (c > 0) {
ascending = 1;
} else {
ascending = -1;
reverseRange(a, lastEqual, runHi);
lastEqual = runHi;
}
}
} else if (ascending == 1) {
if (c < 0) {
return runHi - lo;
}
} else {
if (c > 0) {
reverseRange(a, lastEqual, runHi);
reverseRange(a, lo, runHi);
return runHi - lo;
} else if (c < 0) {
reverseRange(a, lastEqual, runHi);
lastEqual = runHi;
}
}
}
if (ascending == -1) {
reverseRange(a, lastEqual, runHi);
reverseRange(a, lo, runHi);
}
return runHi - lo;
}
so it will work fine with non ascending order?
Yes.
Basically it decided that "ascending" really meant "not descending", without any loss of generality -- in case you have eg [5,5,4 3] it will just break it into [5,5] (ascending) and then [4,3] (descending) on the next call.
As to why, I guess it's for simplicity: simply try counting the number of invocations of reverseRange()
in your code and in the original and you'll get the idea (I got it by noticing how long it took me to understand one version compared to the other :)
edit: WRONG WRONG WRONG! As Oscar Smith pointed out, the reason is to make timsort a stable sorting algorithm. If anyone knows how to transfer the undeserved bounty...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With