Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding crossfilter.js bit-twiddling hack

I found it on crossfilter.js description and am trying to understand what this line means

Crossfilter uses sorted indexes (and a few bit-twiddling hacks) to make this possible, dramatically increasing the perfor­mance of live histograms and top-K lists.

The sorted-index refers to the use of index like in databases, I think. To avoid doing full-table scans. Each dimension has an index on which filtering is done. This leads to incremental filtering (filter each index one by one) and finally when filtered data is finally generated, events are fired.

But looking at the code I could not understand what the hack is, what is it used for and how. Can anyone explain the purpose of bit-twiddling here ?

like image 631
user568109 Avatar asked Feb 13 '23 18:02

user568109


1 Answers

At the beginning there's this declaration:

m = 0, // a bit mask representing which dimensions are in use

A "bit mask" is a simple and efficient way to keep track of a list of true/false options. Let's say you create a listing of cars for sale and you want to set four extra parameters of each car listed: 4x4, ABS, ESP, climatronic. You could define a car object like this:

MyCar = { brand : "Mercedes", 
          extras : {
              "4x4" : false,
              "ABS" : true,
              "ESP" : false,
              "climatronic" : true
          }
        }

These extras are easy to set and retrieve: if( MyCar.extras['4x4'] ){ /* display a ✓ */ }. It's easy, simple, readable but in some cases may require too many resources to process (think: nested loops over a huge collection).

So we may create an array where each index corresponds to a feature, right?

MyCar = {  brand : "Mercedes",
           // 1: 4x4, 2: ABS, 3: ESP, 4: climatronic
           extras : [ false, true, false, true ]
        }

Now to check whether given car has 4x4 you need to use if( MyCar.extras[0] ){ /* display a ✓ */ }. It's less readable (you need to consult documentation to know which feature is under which index) but probably faster an requires less memory.

The concept of a bitmask is similar, but uses numbers instead of arrays. extras : [ false, true, false, true ] could be shortened to extras : [ 0, 1, 0, 1 ] and 0101 is a binary representation of the number 5. So a Car object with a bitmask describing its features could look like this:

MyCar = {  brand : "Mercedes",
           extras : 5
        }

This looks completely unintuitive, but should be very fast and lightweight. How do you check whether the car has 4x4? Well, 4x4 is represented as 1000 (binary representation of 8) so:

if( !!(MyCar.extras & 8) ){ /* display a ✓ */ }

so yeah, now we talk about hacks just by the look of this code. It's really simple but looks cryptic if you haven't played with bitwise operators before. To explain further: our car has the extras attribute set to 5, which in binary is 0101 and we want to check if the left-most bit is set to 1. Bitwise AND operation returns a number that has 1's only where both operands have a 1.

    0101  // 5
AND 1000  // 8
  = 0000 

Now let's check for ABS, which is the 2nd bit from the left (0100, i.e. 4).

var hasABS = !!( MyCar.extras & 4 ); // true

because

    0101  // 5
AND 0100  // 4
  = 0100 

How to set the ESP to true? Simple:

MyCar.extras |= 3;  // now extras == 7

what has happened? a binary OR returns a number that has 1's where either of its operands has a 1 bit.

5 | 3 = 0101 | 0010 = 0111 = 7;

Now to set the ABS to false you'd use

MyCar.extras ^= 4;

I'll leave the workings of binary XOR as an exercise to the reader, but I hope you get the gist of using bitmasks and "bit twiddling" now.

like image 162
pawel Avatar answered Feb 15 '23 11:02

pawel