Profiling sorting algorithms against partially sorted data

Question

We know that several sorts, such as insertion sort, are great on arrays that are 'mostly-sorted' and not so great on random data.

Suppose we wanted to profile the performance improvement/degradation of such an algorithm relative to how 'sorted' the input data is. What would be a good way to generate an 'increasingly sorted' or 'increasingly random' array of elements? How might we measure the 'sortedness' of the input?

Zimbabao · Accepted Answer

Number of Inversion is a usual measure of how much sorted an array is.

A pair of elements (pi,pj) in permutation p is called an inversion in a permutation if i<j and pi >pj. For example, in the permutation (3,1,2,5,4) contains the 3 inversions (3,1), (3,2) and (5,4).

A sorted array got 0 inversion and reverse sorted array got n*(n-1)/2.

staticsan · Answer

You could generate a "partially sorted" dataset by interrupting a modern Fisher-Yates shuffle run on an already ordered dataset.

Also, if you only need a few essentially fixed sets of partially sorted data, then you could generate a column graph of position vs value for each and just eye-ball them. That would let you quickly see the general random-ness of a set, as well things like how much localised order there is.

Profiling sorting algorithms against partially sorted data

Tags:

algorithm

sorting

int3

2 Answers

Zimbabao

staticsan

Recent Activity

Donate For Us

Profiling sorting algorithms against partially sorted data

Tags:

algorithm

sorting

int3

2 Answers

Zimbabao

staticsan

Related questions

Recent Activity

Donate For Us