As part of a school exercise I would like to compare and contrast sorting algorithms in Java. I implemented the sorting algorithms myself, and they sort objects of a class Person which implements the Comparable interface.
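For illustration, a minimal Person that implements Comparable could look like this (the exact fields don't matter for the question, only that compareTo defines the ordering):

    // Illustrative sketch only -- the real Person class may have different fields.
    public class Person implements Comparable<Person> {
        private final String name;
        private final int age;

        public Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public int compareTo(Person other) {
            // Sort by age, breaking ties by name.
            int byAge = Integer.compare(this.age, other.age);
            return byAge != 0 ? byAge : this.name.compareTo(other.name);
        }
    }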
So far so good, but what I can't explain is this: why does the first call to my sorting methods take longer than subsequent calls?
The output below shows my results. Best, Worst and Avg refer to the unsorted array that is passed to the sorting method:
1-call of the sorting methods
InsertionSort Best:1799ms Worst:78ms Avg:789ms
MergeSort Best:10ms Worst:3ms Avg:5ms
2-call of the sorting methods
InsertionSort Best:1065ms Worst:39ms Avg:691ms
MergeSort Best:3ms Worst:2ms Avg:5ms
3-call of the sorting methods
InsertionSort Best:1066ms Worst:39ms Avg:692ms
MergeSort Best:3ms Worst:2ms Avg:5ms
4-call of the sorting methods
InsertionSort Best:1065ms Worst:39ms Avg:691ms
MergeSort Best:3ms Worst:2ms Avg:5ms
Is the JVM doing any optimizations on subsequent calls?
I am puzzled and would greatly appreciate any help!
Edit: Thanks for suggestions and answers so far!
To make a few points clear - each of the calls in my output refers to the time it takes for one complete sort! After each sort I make a new call with freshly UNSORTED arrays!
My source code can be downloaded as an Eclipse project in a zip file at the following Dropbox link: dropbox link to eclipse project.zip
P.S. I have no experience with profilers - if you could point me to a tutorial or so that would be great.
Since it has the upper hand in the average case for most inputs, Quicksort is generally considered the "fastest" sorting algorithm.
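For reference, a textbook in-place quicksort over Comparable objects looks like this (a generic sketch with a simple last-element pivot, not the asker's code):

    // Sketch of a textbook in-place quicksort (Lomuto partition, last-element
    // pivot). Illustrative only; production quicksorts choose pivots more carefully.
    public final class QuickSort {
        public static <T extends Comparable<T>> void sort(T[] a) {
            sort(a, 0, a.length - 1);
        }

        private static <T extends Comparable<T>> void sort(T[] a, int lo, int hi) {
            if (lo >= hi) return;
            int p = partition(a, lo, hi);
            sort(a, lo, p - 1);
            sort(a, p + 1, hi);
        }

        private static <T extends Comparable<T>> int partition(T[] a, int lo, int hi) {
            T pivot = a[hi];
            int i = lo;
            for (int j = lo; j < hi; j++) {
                if (a[j].compareTo(pivot) < 0) {
                    swap(a, i++, j);
                }
            }
            swap(a, i, hi);
            return i;
        }

        private static <T> void swap(T[] a, int i, int j) {
            T tmp = a[i];
            a[i] = a[j];
            a[j] = tmp;
        }
    }

With this naive pivot choice the worst case is O(n²), and it is triggered by already-sorted input - which is why pivot selection (median-of-three, random) matters, and why "Best/Worst/Avg" input orderings can give such different timings.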
In C++, it is faster to process a sorted array than an unsorted array because of branch prediction: in computer architecture, a branch predictor guesses whether a conditional branch (jump) in a program's instruction flow is likely to be taken or not. But branch prediction doesn't play a significant role here. What matters more is that insertion sort runs much more efficiently if the array is already sorted or "close to sorted."
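To see why, here is a textbook insertion sort for Comparable objects (a generic sketch, not necessarily the asker's implementation). On already-sorted input the inner while-loop's condition fails immediately, so the whole sort does only n-1 comparisons:

    // Textbook insertion sort. On already-sorted input the inner loop's
    // condition fails right away, so the sort runs in linear time.
    public static <T extends Comparable<T>> void insertionSort(T[] a) {
        for (int i = 1; i < a.length; i++) {
            T key = a[i];
            int j = i - 1;
            // Shift larger elements one slot to the right.
            while (j >= 0 && a[j].compareTo(key) > 0) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }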
There are many things at work here, as the variety of responses indicates.
But the first run's long runtime is probably explained by JIT (just-in-time) compilation. As discussed here, your algorithm will run in the JVM for a while as interpreted bytecode. When the HotSpot monitor determines that your sort's loops are costly, the JVM will compile them to native code. After that, they'll run considerably faster. The first run pays the penalty of running in the interpreter for a while, plus the extra cost of compilation. This is why "warming up" is a common term in Java benchmarking.
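A quick way to see the warm-up effect yourself is to time the same sort repeatedly inside one JVM run. The sketch below (class name invented for illustration) typically prints a noticeably slower first iteration before the times settle down:

    import java.util.Arrays;
    import java.util.Random;

    // Crude warm-up demo: times the same sort repeatedly in one JVM run.
    // Expect the first iteration(s) to be slowest, then times to settle
    // once HotSpot has compiled the hot code paths to native code.
    public class WarmupDemo {
        public static void main(String[] args) {
            Random rnd = new Random(42);
            for (int run = 1; run <= 5; run++) {
                Integer[] data = new Integer[200_000];
                for (int i = 0; i < data.length; i++) data[i] = rnd.nextInt();
                long start = System.nanoTime();
                Arrays.sort(data); // any sorting method shows the effect
                long ms = (System.nanoTime() - start) / 1_000_000;
                System.out.println("run " + run + ": " + ms + "ms");
            }
        }
    }

For serious measurements, a benchmark harness such as JMH takes care of warm-up iterations, dead-code elimination and other JVM pitfalls for you.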
The differences in performance on different inputs are tied to the sort algorithm. Many algorithms behave differently based on initial data organization, and many are deliberately organized to do well on initially sorted or nearly sorted data. Here is a brilliant demonstration for the case of nearly sorted input. E.g. insertion sort is quadratic time in general, but linear time on nearly sorted input (actually O((k+1)n) for input of size n where elements are no more than k positions from correctly sorted).
Then there is the branch prediction issue already mentioned in another answer. Modern processors have various mechanisms that attempt to "guess" which way a branch (essentially an "if" statement at the machine level) will go, based on recent history gathered as the program runs. The cost of a bad guess is high. The goodness of the guess is likely to be affected by both algorithm and data details.
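The effect can be observed in Java with a loop whose branch is unpredictable on random data but perfectly predictable on sorted data, along the lines of the famous demo (a sketch; exact timings depend heavily on the JIT and hardware):

    import java.util.Arrays;
    import java.util.Random;

    // Branch prediction demo: the same loop over the same values runs much
    // faster when the data is sorted, because the "if" becomes predictable.
    public class BranchDemo {
        public static void main(String[] args) {
            int[] data = new int[1 << 20];
            Random rnd = new Random(1);
            for (int i = 0; i < data.length; i++) data[i] = rnd.nextInt(256);

            int[] sorted = data.clone();
            Arrays.sort(sorted);

            time("unsorted", data);
            time("sorted  ", sorted);
        }

        private static void time(String label, int[] data) {
            long sum = 0;
            long start = System.nanoTime();
            for (int pass = 0; pass < 100; pass++) {
                for (int v : data) {
                    if (v >= 128) sum += v; // unpredictable on random data
                }
            }
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(label + ": " + ms + "ms (checksum " + sum + ")");
        }
    }

Note that the unsorted case here also pays the JIT warm-up cost, so repeat or reorder the calls before drawing conclusions; on some JITs the branch may even be compiled to a branch-free conditional move, making the difference disappear.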
Processing a sorted array is faster than processing an unsorted one due to branch prediction. This has been covered extensively in the most famous Stack Overflow question.