Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java fastest way to get cardinality of BitSet intersection

The function below takes two BitSets, makes a copy of the first (it must not be overridden), intersects the copy with the second (bitwise AND) and returns the cardinality of the result.

public int getIntersectionSize(BitSet bits1, BitSet bits2) {
    BitSet copy = (BitSet) bits1.clone();
    copy.and(bits2);
    return copy.cardinality();
}

I'm interested if this code can be sped up? This function is called billion of times so even a microsecond speed up makes sense plus I'm curious about the fastest possible code.

like image 411
Brandon McHomery Avatar asked Aug 02 '15 12:08

Brandon McHomery


People also ask

What is a cardinality of a BitSet?

Java BitSet | cardinality()It creates an array of bits represented by 0s and 1s. Syntax. public int cardinality() Explanation: The method returns the number of 1s in this BitSet.

What is cardinality Java?

cardinality() is an instance method of the BitSet class that is used to get the number of bits set to True . The cardinality method is defined in the BitSet class. The BitSet class is defined in the java. util package.

What does intersects do java?

The intersects(BitSet set) method of Java BitSet class returns Boolean value true or false on the basis of whether parameter BitSet has intersected with this BitSet or not. It returns true if the specified BitSet set is also true in this BitSet.


2 Answers

If you're going to use each BitSet several times, it could be worthwhile to create a long array corresponding to each BitSet. For each BitSet:

long[] longs = bitset.toLongArray();

Then you can use the following method, which avoids the overhead of creating a cloned BitSet. (This assumes that both arrays are the same length).

int getIntersectionSize(long[] bits1, long[] bits2) {
    int nBits = 0;
    for (int i=0; i<bits1.length; i++)
        nBits += Long.bitCount(bits1[i] & bits2[i]);
    return nBits;
}
like image 50
James Trimble Avatar answered Oct 22 '22 23:10

James Trimble


Here is an alternative version, but I'm not sure if it is really faster, depends on nextSetBit.

public int getIntersectionsSize(BitSet bits1, BitSet bits2) {
   int count = 0;
   int i = bits1.nextSetBit(0);
   int j = bits2.nextSetBit(0);
   while (i >= 0 && j >= 0) {
      if (i < j) {
         i = bits1.nextSetBit(i + 1);
      } else if (i > j) {
         j = bits2.nextSetBit(j + 1);
      } else {
         count++;
         i = bits1.nextSetBit(i + 1);
         j = bits2.nextSetBit(j + 1);
      }
   }
   return count;
}

The above is the readable version, hopefully good enough for the compiler, but you could optimize it manually I guess:

public int getIntersectionsSize(BitSet bits1, BitSet bits2) {
   int count = 0;
   for (int i = bits1.nextSetBit(0), j = bits2.nextSetBit(0); i >= 0 && j >= 0; ) {
      while (i < j) {
         i = bits1.nextSetBit(i + 1);
         if (i < 0)
            return count;
      }
      if (i == j) {
         count++;
         i = bits1.nextSetBit(i + 1);
      }
      while (j < i) {
         j = bits2.nextSetBit(j + 1);
         if (j < 0)
            return count;
      }
      if (i == j) {
         count++;
         j = bits2.nextSetBit(j + 1);
      }
   }
   return count;
}
like image 1
maraca Avatar answered Oct 23 '22 00:10

maraca