Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What are some alternatives to a bit array?

Tags:

People also ask

What are bit arrays used for?

Bit arrays can be used for the allocation of memory pages, inodes, disk sectors, etc. In such cases, the term bitmap may be used. However, this term is frequently used to refer to raster images, which may use multiple bits per pixel.

Is there a bit array in Java?

This Java program is to Implement Bit Array. A bit array (also known as bitmap, bitset, bit string, or bit vector) is an array data structure that compactly stores bits. It can be used to implement a simple set data structure.

What is bit array in C++?

A Bit Array is an array data structures that compactly stores data. It is basically used to implement a simple data structure.


I have an information retrieval application that creates bit arrays on the order of 10s of million bits. The number of "set" bits in the array varies widely, from all clear to all set. Currently, I'm using a straight-forward bit array (java.util.BitSet), so each of my bit arrays takes several megabytes.

My plan is to look at the cardinality of the first N bits, then make a decision about what data structure to use for the remainder. Clearly some data structures are better for very sparse bit arrays, and others when roughly half the bits are set (when most bits are set, I can use negation to treat it as a sparse set of zeroes).

  • What structures might be good at each extreme?
  • Are there any in the middle?

Here are a few constraints or hints:

  1. The bits are set only once, and in index order.
  2. I need 100% accuracy, so something like a Bloom filter isn't good enough.
  3. After the set is built, I need to be able to efficiently iterate over the "set" bits.
  4. The bits are randomly distributed, so run-length–encoding algorithms aren't likely to be much better than a simple list of bit indexes.
  5. I'm trying to optimize memory utilization, but speed still carries some weight.

Something with an open source Java implementation is helpful, but not strictly necessary. I'm more interested in the fundamentals.