We know that several sorts, such as insertion sort, are great on arrays that are 'mostly-sorted' and not so great on random data.
Suppose we wanted to profile the performance improvement/degradation of such an algorithm relative to how 'sorted' the input data is. What would be a good way to generate an 'increasingly sorted' or 'increasingly random' array of elements? How might we measure the 'sortedness' of the input?
The number of inversions is a standard measure of how sorted an array is.
A pair of elements (pi, pj) in a permutation p is called an inversion if i < j and pi > pj. For example, the permutation (3, 1, 2, 5, 4) contains the 3 inversions (3, 1), (3, 2) and (5, 4).
A sorted array has 0 inversions, and a reverse-sorted array has the maximum, n*(n-1)/2.
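To make that measure concrete, here is a small sketch of an inversion counter (the function name `count_inversions` is my own; a simple O(n^2) scan is fine for profiling-sized inputs, though a merge-sort-based version would do it in O(n log n)):

```python
def count_inversions(p):
    """Count pairs (i, j) with i < j and p[i] > p[j].

    Simple O(n^2) double loop; adequate for modest array sizes.
    """
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
```

For the example above, `count_inversions([3, 1, 2, 5, 4])` returns 3; a sorted array returns 0 and a reverse-sorted array of length n returns n*(n-1)/2.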
You could generate a "partially sorted" dataset by running a modern Fisher-Yates shuffle on an already sorted array and stopping it partway through: the more swap steps you allow, the more random the result.
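A sketch of that interrupted shuffle, assuming the goal is a single knob (`swaps`) that ranges from fully sorted to fully random (the function name and parameters are my own):

```python
import random


def partially_shuffled(n, swaps, seed=None):
    """Run only the first `swaps` steps of a Fisher-Yates shuffle on a sorted array.

    swaps=0 returns the fully sorted array; swaps=n-1 completes the
    shuffle and yields a uniformly random permutation.
    """
    rng = random.Random(seed)
    a = list(range(n))
    # Fisher-Yates walks i from n-1 down to 1, swapping a[i] with a
    # random a[j], j <= i. Stop early once `swaps` steps have run.
    for step, i in enumerate(range(n - 1, 0, -1)):
        if step >= swaps:
            break
        j = rng.randint(0, i)
        a[i], a[j] = a[j], a[i]
    return a
```

For profiling, you would sweep `swaps` from 0 to n-1 and record both the inversion count of each generated array and the sort's running time on it.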
Also, if you only need a few essentially fixed sets of partially sorted data, you could generate a column graph of position vs. value for each and simply eyeball them. That lets you quickly see how random a set is overall, as well as how much localised order it contains.