I have alot of different sorting algorithms which all have the following signature:
void <METHOD>_sort_ints(int * array, const unsigned int ARRAY_LENGTH);
Are there any testing suites for sorting which I could use for the purpose of making empirical comparisons?
This detailed discussion, as well as linking to a large number of related web pages you are likely to find useful, also describes a useful set of input data for testing sorting algorithms (see the linked page for reasons). Summarising:
The definitive study of sorting is Bob Sedgewick's doctoral dissertation. But there's a lot of good information in his algorithms textbooks, and those are the first two places I would look for test suite and methodology. If you've had a recent course you'll know more than I do; last time I had a course, the best method was to use quicksort down to partitions of size 12, then run insertion sort on the entire array. But the answers change as quickly as the hardware.
Jon Bentley's Programming Perls books have some other info on sorting.
You can quickly whip up a test suite containing
Random integers
Sorted integers
Reverse sorted integers
Sorted integers, mildly perturbed
If memory serves, these are the most important cases for a sort algorithm.
If you're looking to sort arrays that won't fit in cache, you'll need to measure cache effects. valgrind
is effective if slow.
sortperf.py has a well selected suite of benchmark test cases and was used to support the essay found here and make timsort THE sort in Python lo that many years ago. Note that, at long last, Java may be moving to timsort too, thanks to Josh Block (see here), so I imagine they have written their own version of the benchmark test cases -- however, I can't easily find a reference to it. (timsort, a stable, adaptive, iterative natural mergesort variant, is especially suited to languages with reference-to-object semantics like Python and Java, where "data movement" is relatively cheap [[since all that's ever being moved is references aka pointers, not blobs of unbounded size;-)]], but comparisons can be relatively costly [[since there is no upper bound to the complexity of a comparison function -- but then this holds for any language where sorting may be customized via a custom comparison or key-extraction function]]).
This site shows the various sorting algorithms using four groups: http://www.sorting-algorithms.com/
In addition to the four group in Norman's answer you want to check the sorting algorithms with collection of numbers that have a few similarities in the numbers:
Changing the number of elements in the collection is also a good practice check each algorithm with 1K, 1M, 1G etc. to see what are the memory implications of that algorithm.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With