Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sorting algorithm using bitmask and binary operators

I'm struggling to understand how the following algorithm (in Java) works.

private static void sort(int[] values, k) 
{
    int mask = 0x00000001 << k;
    int insertIndex = 0;
    int ArrayList<Integer> big = new ArrayList<Integer>();

    for (int i = 0; i < values.length; ++i) {
        if ((values[i] & mask) == 0) 
            values[insertIndex++] = values[i];
        else
            big.add(values[i]);
    }

    for (int i = 0; i < big.size(); ++i)
        values[insertIndex++] = big.get(i);
}

public static void sort(int[] values) 
{
    for (int i = 0; i < 31; ++i)
        sort(values, i);
}

Here's what I undestand so far:

At first 0x00000001 (32 bit?) gets left-shifted by k. So mask now is some other number. Then we're checking if current value values[i] added with mask using Binary-And operator equals to 0.

I can't understand the role of (values[i] & mask) == 0 clause. Second for-loop messing with my head also. And why are we iterating in public static void sort(int[] values) only 31 times?

This algorithm doesn't sort negative integers correctly. How so? How can it be modified so that negative integers get sorted too?

It is said that this algorithm uses similar concepts of one the well-known sorting algorithms. Like Heap-sort, Insertion Sort, Quick-sort, Merge-Sort, Bucket-Sort or Radix-Sort. I eliminated the possibility of Quick-sort, because it uses partitions and recursion. Merge sort uses recursion and merges sub-arrays, so I eliminated it too. Insertion-Sort isn't likely to be the one either, because of the drastic time-complexity difference. Insertions sort O(n^2) and given algorithm is O(n). Bucket-sort doesn't actually sort anything, it just divides array in sub-arrays, which then can be sorted using some sorting algorithm. Heap-sort is comparison-based algorithm, and the given algorithms doesn't look like one. So the only possibility that's left is Radix-Sort, which is not a comparison-based algorithm. So my best bet is that the given algorithm is similar to Radix sort. Am I right or am I hopelessly lost?

like image 828
the_dude Avatar asked Dec 25 '22 02:12

the_dude


1 Answers

Yep, this is an implementation of radix sort using bitwise operators. The implementation isn't very intuitive thanks to the bitwise operators.

The algorithm works by sorting through the list one digit at a time. This is why there are 31 loops... because there are only 32 bits in a Java integer, and the leftmost bit indicates that the value is negative, so there are only 31 bits in a positive Java integer.

The mask is used to keep track of which place is being checked. 0x0001 is the ones place, 0x0002 is the twos place, 0x0004 is the threes place, and 0x0001 << n is the nth place.

The sorting is done by putting all of the integers where the checked bit is 0 on the left side of the array, and all of the integers where the checked bit is 1 on the right. This is pretty easy: mask & values[i] == 0 if the masked bit of the value is a 0.

Things get tricky when we start moving variables around. Whenever we find a value with a 0 in our checked bit, we want to move it to the left. However, this involves overwriting a value. insertIndex and i both start at 0. Each time we find a value at position 'i' that we need to move to the left, we move it to insertIndex, and if we find a value we need to move to the right, we store it in a temporary arraylist for later.

0 0 1 0 1 0   i and insertindex are the same, so we copy the 0 to itself
^

0 0 1 0 1 0   Again.
  ^

0 0 1 0 1 0   Here we find our first 1. We don't increment insertindex
    ^         We store the 1 in the temp array list. Nothing is copied.

0 0 0 0 1 0   This is a zero. We copy the zero from i to insertindex
    ^<^

0 0 0 0 1 0   Another 1 goes in the temp array list. Nothing is copied, we don't 
              increment insertindex.
      ^ ^

0 0 0 0 1 0   We copy the last zero from i to insertindex. Remember that the
      ^<<<^   zero we overwrite is actually safely in the third bit - it
              was copied earlier.

Now for the second for loop, we take all of the values we stored in our temp array list and we put them on the now-empty right side of the array. Everything on this side is somewhere else on the "left side" of the array.

0 0 0 0 1 0   Retrieve the first value from the temp array list
        ^

0 0 0 0 1 1   Retrieve the second value from the temp array list
          ^

Hope that helps! If you want to extend this to negative values, try looking at Two's Complement notation

like image 191
isharacomix Avatar answered Dec 28 '22 09:12

isharacomix