
Fast code for searching bit-array for contiguous set/clear bits?

Is there some reasonably fast code out there which can help me quickly search a large bitmap (a few megabytes) for runs of contiguous zero or one bits?

By "reasonably fast" I mean something that can take advantage of the machine word size and compare entire words at once, instead of doing bit-by-bit analysis which is horrifically slow (such as one does with vector<bool>).

It's very useful for e.g. searching the bitmap of a volume for free space (for defragmentation, etc.).

user541686 asked Jul 30 '12 11:07


1 Answer

Windows has an RTL_BITMAP data structure that you can use together with its accompanying APIs (e.g. RtlFindClearBits and RtlFindSetBits).

But I needed the code for this some time ago, so I wrote it myself and posted it here (warning, it's a little ugly):
https://gist.github.com/3206128

I have only partially tested it, so it might still have bugs (especially when searching in reverse). But a recent version (only slightly different from this one) seemed to be usable for me, so it's worth a try.

The fundamental operation for the entire thing is being able to -- quickly -- find the length of a run of bits:

long long GetRunLength(
    const void *const pBitmap, unsigned long long nBitmapBits,
    long long startInclusive, long long endExclusive,
    const bool reverse, /*out*/ bool *pBit);

Everything else should be easy to build upon this, given its versatility.
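
To illustrate the word-at-a-time idea, here is a minimal forward-only sketch. It is not the code from the gist, and the names RunLengthForward, FindRun and CountTrailingZeros are made up for this example. It assumes the bitmap is stored as 64-bit words with bit i living at bit (i % 64) of word i / 64, and that the buffer is padded to a whole number of words.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: number of trailing zero bits in x (requires x != 0).
// A compiler intrinsic or C++20's std::countr_zero could replace this loop.
static inline unsigned CountTrailingZeros(std::uint64_t x) {
    unsigned n = 0;
    while ((x & 1) == 0) { x >>= 1; ++n; }
    return n;
}

// Length of the run of identical bits that starts at bit `start`, scanning
// forward and never going past `end` (exclusive). The value of the bits in
// the run is stored in *bit. Assumed layout: LSB-first 64-bit words.
std::size_t RunLengthForward(const std::uint64_t* words,
                             std::size_t start, std::size_t end, bool* bit) {
    if (start >= end) { *bit = false; return 0; }
    const bool value = ((words[start / 64] >> (start % 64)) & 1) != 0;
    *bit = value;
    std::size_t pos = start;
    while (pos < end) {
        // After the XOR, a set bit marks a position that differs from `value`.
        std::uint64_t diff = words[pos / 64] ^ (value ? ~std::uint64_t{0} : std::uint64_t{0});
        diff >>= (pos % 64);             // ignore bits below `pos` in this word
        if (diff != 0) {
            pos += CountTrailingZeros(diff);  // first differing bit ends the run
            break;
        }
        pos += 64 - (pos % 64);          // rest of the word matches: skip it whole
    }
    if (pos > end) pos = end;            // clamp if the run crosses `end`
    return pos - start;
}

// Built on top of RunLengthForward: index of the first run of at least
// `count` bits equal to `value`, or `nbits` if there is no such run.
std::size_t FindRun(const std::uint64_t* words, std::size_t nbits,
                    bool value, std::size_t count) {
    std::size_t pos = 0;
    while (pos < nbits) {
        bool bit = false;
        const std::size_t len = RunLengthForward(words, pos, nbits, &bit);
        if (bit == value && len >= count) return pos;
        pos += len;                      // len >= 1 here, so this always advances
    }
    return nbits;
}
```

For a free-space search, something like FindRun(words, nbits, false, clustersNeeded) would return the first sufficiently long run of clear bits. Each loop iteration handles up to 64 bits with one XOR and one comparison, which is where the speedup over bit-by-bit scanning comes from.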

I tried to include some SSE code, but it didn't noticeably improve the performance. However, in general, the code is many times faster than doing bit-by-bit analysis, so I think it might be useful.

It should be easy to test if you can get a hold of vector<bool>'s buffer somehow -- and if you're on Visual C++, then there's a function I included which does that for you. If you find bugs, feel free to let me know.

user541686 answered Oct 01 '22 23:10