I'm trying to create a generic hash table in C. I've read a few different implementations, and came across a couple of different approaches.
The first is to use macros like this: http://attractivechaos.awardspace.com/khash.h.html
And the second is to use a struct with 2 void pointers like this:
struct hashmap_entry
{
void *key;
void *value;
};
From what I can tell this approach isn't great because it means that each entry in the map requires at least 2 allocations: one for the key and one for the value, regardless of the data types being stored. (Is that right???)
I haven't been able to find a decent way of keeping it generic without going the macro route. Does anyone have any tips or examples that might help me out?
C does not provide what you need directly, nevertheless you may want to do something like this:
Imagine that your hash table is a fixed size array of double linked lists and it is OK that items are always allocated/destroyed on the application layer. These conditions will not work for every case, but in many cases they will. Then you will have these data structures and sketches of functions and protototypes:
struct HashItemCore
{
HashItemCore *m_prev;
HashItemCore *m_next;
};
struct HashTable
{
HashItemCore m_data[256]; // This is actually array of circled
// double linked lists.
int (*GetHashValue)(HashItemCore *item);
bool (*CompareItems)(HashItemCore *item1, HashItemCore *item2);
void (*ReleaseItem)(HashItemCore *item);
};
void InitHash(HashTable *table)
{
// Ensure that user provided the callbacks.
assert(table->GetHashValue != NULL && table->CompareItems != NULL && table->ReleaseItem != NULL);
// Init all double linked lists. Pointers of empty list should point to themselves.
for (int i=0; i<256; ++i)
table->m_data.m_prev = table->m_data.m_next = table->m_data+i;
}
void AddToHash(HashTable *table, void *item);
void *GetFromHash(HashTable *table, void *item);
....
void *ClearHash(HashTable *table);
In these functions you need to implement the logic of the hash table. While working they will be calling user defined callbacks to find out the index of the slot and if items are identical or not.
The users of this table should define their own structures and callback functions for every pair of types that they want to use:
struct HashItemK1V1
{
HashItemCore m_core;
K1 key;
V1 value;
};
int CalcHashK1V1(void *p)
{
HashItemK1V1 *param = (HashItemK1V1*)p;
// App code.
}
bool CompareK1V1(void *p1, void *p2)
{
HashItemK1V1 *param1 = (HashItemK1V1*)p1;
HashItemK1V1 *param2 = (HashItemK1V1*)p2;
// App code.
}
void FreeK1V1(void *p)
{
HashItemK1V1 *param = (HashItemK1V1*)p;
// App code if needed.
free(p);
}
This approach will not provide type safety because items will be passed around as void pointers assuming that every application structure starts with HashItemCore
member. This will be sort of hand made polymorphysm. This is maybe not perfect, but this will work.
I implemented this approach in C++ using templates. But if you will strip out all fancies of C++, in the nutshell it will be exactly what I described above. I used my table in multiple projects and it worked like charm.
A generic hashtable in C is a bad idea.
IIRC, the linux kernel uses macros to create and maintain (some of?) its hashtables.
C does not have generic data types, so what you want to do (no extra allocations and no void*
casting) is not really possible. You can use macros to generate the right data functions/structs on the fly, but you're trying to avoid macros as well.
So you need to give up at least one of your ideas.
You could have a generic data structure without extra allocations by allocating something like:
size_t key_len;
size_t val_len;
char key[];
char val[];
in one go and then handing out either void pointers, or adding an api for each specific type.
Alternatively, if you have a limited number of types you need to handle, you could also tag the value with the right one so now each entry contains:
size_t key_len;
size_t val_len;
int val_type;
char key[];
char val[];
but in the API at least you can verify that the requested type is the right one.
Otherwise, to make everything generic, you're left with either macros, or changing the language.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With