This question has been asked previously but there was no answer for it at that time so I decided to ask it again.
I need an efficient implementation of a Bloom filter in C (not C++). If there is no such thing available, I would not mind implementing one, if given some good reference so that it doesn't take too much of my time.
I want to use this data structure for inserts and tests in a ratio (1:20k), so primarily it is test-intensive. The data to be tested is 64 bit integers.
Implementation. An empty Bloom filter is a bit array of m bits, all set to 0. There are also k different hash functions, each of which maps a set element to one of the m bit positions. To add an element, feed it to the hash functions to get k bit positions, and set the bits at these positions to 1.
The advantage of a Bloom filter over the established dictionary structures is space efficiency. A Bloom filter needs only a constant number of bits per (prospective) element, while keeping the FPR constant, independent of the size of the elements' universe.
A Bloom filter is defined as a data structure designed to identify of a element's presence in a set in a rapid and memory efficient manner. A specific data structure named as probabilistic data structure is implemented as bloom filter.
A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. For example, checking availability of username is set membership problem, where the set is the list of all registered username.
I have a stand-alone plain C library here which may be of use: https://github.com/jvirkki/libbloom
Not to do too much self-promotion, but I've written a plugin for the Geany editor/IDE that filters out duplicate text lines, it uses a Bloom filter.
The implementation is in C, and you can find it right here on GitHub. It's GPL v3, so depending on your exact needs you may or may not be able to use it.
Some notes about my implementation:
BloomContains
callback function pointer used by bloom_filter_new()
). Just pass NULL
to get a "pure" filter.It hasn't been heavily tuned for performance, but should be okay. I would appreciate any feedback you might have after testing it, of course!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With