Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Weird step in counting sort from Intro to Algorithms

This seems to be a very common book (Cormen, Leiserson, Rivest, Stein) so hopefully someone will be able to assist. In chapter 8, the algorithm for counting sort is given. It makes sense where you have the input array A and you find the range from 0 to k for the size that array C will be. C[i] is then made to contain the number of elements in A equal to i. For example:

A: [2,5,3,0,2,3,0,3]
C: [2,0,2,3,0,1]

But after this they make it so that C[i] contains the number of elements less than or equal to i. For example:

C: [2,2,4,7,7,8]

Why is this necessary? Couldn't you just iterate through the original C and have a sorted array from that? You know the exact count of each number so you could just go in order putting the right amount of each number in B and have a sorted array. Does transforming C from the first form to the second form somehow make it stable?

like image 463
telkins Avatar asked Oct 05 '22 23:10

telkins


1 Answers

I suppose you are proposing to do the following with the intermediate C (using index 1 arrays):

i = 1
for k = 1 to len(C)
  for j = 1 to C[i]
    B[i] = k
    i = i + 1

This seems to be reasonable, and has the same asymptotic running time. However, consider the case where the items whose keys you are sorting on are not just single integers, but have some other data attached to them. The counting algorithm makes the sort stable; relative orders of items with same keys are preserved (see the Wikipedia article). You lose the ability to sort general data if you just assign the output from the indices of C. Hence why the sort assigns elements via B[C[A[j]]] <- A[j].

For others who are curious, this is the completion of the original algorithm:

# C[i] now contains the number of elements equal to i.
for i = 1 to k
  C[i] <- C[i] + C[i-1]
# C[i] now contains the number of elements less than or equal to i.
for j = length[A] downto 1
  B[C[A[j]]] <- A[j]
  C[A[j]] <- C[A[j]] - 1

To explain the decrement in the last part, I cite the book, which also explains the stability of the sort:

Because the elements might not be distinct, we decrement C[A[j]] each time we place a value A[j] into the B array. Decrementing C[A[j]] causes the next input element with a value equal to A[j], if one exists, to go to the position immediately before A[j] in the output array.

Also, if we did that, I guess we wouldn't be able to call it COUNTING-SORT anymore because it wouldn't be counting the number of items less than any particular item in the input array (as they define it). :)

like image 180
Andrew Mao Avatar answered Oct 13 '22 11:10

Andrew Mao