I'm trying to create an efficient look-up table in C.
I have an integer as a key and a variable length char* as the value.
I've looked at uthash, but this requires a fixed length char* value. If I make this a big number, then I'm using too much memory.
struct my_struct {
    int key;
    char value[10];
    UT_hash_handle hh;
};
Has anyone got any pointers? Any insight greatly appreciated.
Thanks everyone for the answers. I've gone with uthash and defined my own custom struct to accommodate my data.
You first have to think of your collision strategy:
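We'll pick 1: on a collision, try again with a different slot, which is what the try parameter in the code below is for.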
Then you have to choose a nicely distributed hash function. For the example, we'll pick
int hash_fun(int key, int try, int max) {
    int home = ((key % max) + max) % max; // stays in 0..max-1 even for negative keys
    return (home + try) % max;
}
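To see what the try argument does, here is a small sketch that lists the slots visited for one key. `probe_sequence` and `probe_demo` are hypothetical helpers, not part of the answer. This copy of `hash_fun` first maps the key to a non-negative home slot, so negative keys stay in range.

```c
#include <assert.h>
#include <stdio.h>

// Slot for `key` on attempt number `try` in a table of `max` slots.
int hash_fun(int key, int try, int max) {
    int home = ((key % max) + max) % max; // 0..max-1, also for negative keys
    return (home + try) % max;
}

// Hypothetical helper: writes the first `count` probed slots into `out`.
void probe_sequence(int key, int max, int count, int *out) {
    for (int try = 0; try < count; try++) {
        out[try] = hash_fun(key, try, max);
    }
}

// Usage: key 8 in a 10-slot table probes slot 8, then 9, then wraps to 0.
void probe_demo(void) {
    int slots[3];
    probe_sequence(8, 10, 3, slots);
    assert(slots[0] == 8 && slots[1] == 9 && slots[2] == 0);
    printf("%d %d %d\n", slots[0], slots[1], slots[2]);
}
```

Each retry simply moves one slot further, wrapping at the end of the table, so keys that share a home slot end up next to each other.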
If you need something better, maybe have a look at the middle-square method.
Then you'll have to decide what a hash table is.
struct hash_table {
    int max;
    int number_of_elements;
    struct my_struct **elements;
};
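The insert and retrieve code treats a cell holding 0 as empty, so the elements array has to start out zeroed. A minimal sketch of how such a table could be allocated and released follows; `hash_table_create` and `hash_table_destroy` are hypothetical names, and the entries themselves are assumed to stay owned by the caller.

```c
#include <assert.h>
#include <stdlib.h>

struct my_struct; // the entry type; only pointers to it are stored here

struct hash_table {
    int max;
    int number_of_elements;
    struct my_struct **elements;
};

// Hypothetical helper: returns a table with `max` empty cells, or NULL on failure.
struct hash_table *hash_table_create(int max) {
    if (max <= 0) {
        return NULL;
    }
    struct hash_table *table = malloc(sizeof *table);
    if (table == NULL) {
        return NULL;
    }
    // calloc zero-fills the array; all-bits-zero is a null pointer on
    // mainstream platforms, which is what the "empty cell" test relies on.
    table->elements = calloc((size_t)max, sizeof *table->elements);
    if (table->elements == NULL) {
        free(table);
        return NULL;
    }
    table->max = max;
    table->number_of_elements = 0;
    return table;
}

// Frees the table and its cell array, but not the entries stored in it.
void hash_table_destroy(struct hash_table *table) {
    if (table != NULL) {
        free(table->elements);
        free(table);
    }
}
```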
Then, we'll have to define how to insert and to retrieve.
int hash_insert(struct my_struct *data, struct hash_table *hash_table) {
    int try, hash;
    if (hash_table->number_of_elements >= hash_table->max) {
        return 0; // FULL
    }
    // Probe at most max slots; the table is not full, so one of them is empty.
    for (try = 0; try < hash_table->max; try++) {
        hash = hash_fun(data->key, try, hash_table->max);
        if (hash_table->elements[hash] == 0) { // empty cell
            hash_table->elements[hash] = data;
            hash_table->number_of_elements++;
            return 1;
        }
    }
    return 0; // not reached while number_of_elements is accurate
}
struct my_struct *hash_retrieve(int key, struct hash_table *hash_table) {
    int try, hash;
    // Stop after max probes so a full table without the key cannot loop forever.
    for (try = 0; try < hash_table->max; try++) {
        hash = hash_fun(key, try, hash_table->max);
        if (hash_table->elements[hash] == 0) {
            return 0; // Nothing found
        }
        if (hash_table->elements[hash]->key == key) {
            return hash_table->elements[hash];
        }
    }
    return 0; // Probed every slot without finding the key
}
And last, a method to remove:
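int hash_delete(int key, struct hash_table *hash_table) {
    int try, hash, next;
    struct my_struct *moved;
    for (try = 0; try < hash_table->max; try++) {
        hash = hash_fun(key, try, hash_table->max);
        if (hash_table->elements[hash] == 0) {
            return 0; // Nothing found
        }
        if (hash_table->elements[hash]->key == key) {
            hash_table->number_of_elements--;
            hash_table->elements[hash] = 0;
            // Re-insert the entries that follow, so a later lookup does not
            // stop early at the new hole. This assumes hash_fun probes the
            // next slot on each retry, as the one above does.
            next = (hash + 1) % hash_table->max;
            while (hash_table->elements[next] != 0) {
                moved = hash_table->elements[next];
                hash_table->elements[next] = 0;
                hash_table->number_of_elements--;
                hash_insert(moved, hash_table);
                next = (next + 1) % hash_table->max;
            }
            return 1; // Success
        }
    }
    return 0; // Probed every slot without finding the key
}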
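To check how the three operations behave together when keys collide, here is a self-contained sketch. It restates the functions compactly; the `const char *value` field and `collision_demo` are assumptions of this sketch. The delete here re-inserts the entries after the removed one, since clearing a slot alone could cut a later entry off from its probe sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct my_struct { int key; const char *value; }; // value may be any length
struct hash_table { int max; int number_of_elements; struct my_struct **elements; };

int hash_fun(int key, int try, int max) {
    int home = ((key % max) + max) % max;
    return (home + try) % max;
}

int hash_insert(struct my_struct *data, struct hash_table *t) {
    if (t->number_of_elements >= t->max) return 0; // full
    for (int try = 0; try < t->max; try++) {
        int hash = hash_fun(data->key, try, t->max);
        if (t->elements[hash] == NULL) {
            t->elements[hash] = data;
            t->number_of_elements++;
            return 1;
        }
    }
    return 0;
}

struct my_struct *hash_retrieve(int key, struct hash_table *t) {
    for (int try = 0; try < t->max; try++) {
        int hash = hash_fun(key, try, t->max);
        if (t->elements[hash] == NULL) return NULL; // end of probe chain
        if (t->elements[hash]->key == key) return t->elements[hash];
    }
    return NULL;
}

int hash_delete(int key, struct hash_table *t) {
    for (int try = 0; try < t->max; try++) {
        int hash = hash_fun(key, try, t->max);
        if (t->elements[hash] == NULL) return 0; // nothing found
        if (t->elements[hash]->key != key) continue;
        t->elements[hash] = NULL;
        t->number_of_elements--;
        // Re-insert the entries that follow so none is hidden behind the hole.
        for (int next = (hash + 1) % t->max; t->elements[next] != NULL;
             next = (next + 1) % t->max) {
            struct my_struct *moved = t->elements[next];
            t->elements[next] = NULL;
            t->number_of_elements--;
            hash_insert(moved, t);
        }
        return 1;
    }
    return 0;
}

// Keys 1, 11 and 21 all start at slot 1 in a 10-slot table.
// Returns the number of entries left after deleting key 11.
int collision_demo(void) {
    struct my_struct *cells[10] = {0};
    struct hash_table table = {10, 0, cells};
    struct my_struct a = {1, "a"}, b = {11, "a longer value"}, c = {21, "c"};
    hash_insert(&a, &table);
    hash_insert(&b, &table);
    hash_insert(&c, &table);
    assert(strcmp(hash_retrieve(11, &table)->value, "a longer value") == 0);
    hash_delete(11, &table);
    assert(hash_retrieve(11, &table) == NULL); // gone
    assert(hash_retrieve(21, &table) == &c);   // still reachable
    return table.number_of_elements;
}
```

Because the table stores pointers to entries, each value can be as long as it needs to be without reserving a fixed-size buffer per slot.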
Declare the value field as void *value.
This way you can have any type of data as the value, but the responsibility for allocating and freeing it will be delegated to the client code.
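As a rough sketch of what that delegation could look like for string values, the client code below allocates exactly as many bytes as each string needs and frees them again later. `struct entry`, `entry_with_string` and `entry_free` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct entry {      // hypothetical entry type
    int key;
    void *value;    // allocated and freed by the client code
};

// Copies `text` into freshly allocated memory sized to fit it.
// Returns NULL if either allocation fails.
struct entry *entry_with_string(int key, const char *text) {
    struct entry *e = malloc(sizeof *e);
    if (e == NULL) {
        return NULL;
    }
    size_t size = strlen(text) + 1; // include the terminating '\0'
    e->value = malloc(size);
    if (e->value == NULL) {
        free(e);
        return NULL;
    }
    memcpy(e->value, text, size);
    e->key = key;
    return e;
}

// Releases the value first, then the entry that points to it.
void entry_free(struct entry *e) {
    if (e != NULL) {
        free(e->value);
        free(e);
    }
}
```

A short value costs a few bytes and a long one costs as much as it needs, which avoids sizing every entry for the worst case.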
It really depends on the distribution of your key field. For example, if it's a unique value always between 0 and 255 inclusive, just use key % 256 to select the bucket and you have a perfect hash.
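For that special case, a minimal sketch could be a plain array indexed by the key. `buckets`, `direct_put`, `direct_get` and `direct_demo` are hypothetical names, and the code assumes the keys really are unique and within 0 to 255.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUCKETS 256

// File-scope array, so every bucket starts out as NULL.
const char *buckets[BUCKETS];

// Both helpers assume 0 <= key <= 255, so key % BUCKETS is the key itself
// and no two keys can share a bucket.
void direct_put(int key, const char *value) { buckets[key % BUCKETS] = value; }
const char *direct_get(int key) { return buckets[key % BUCKETS]; }

void direct_demo(void) {
    direct_put(0, "short");
    direct_put(255, "a much longer value");
    assert(strcmp(direct_get(255), "a much longer value") == 0);
    assert(direct_get(7) == NULL); // never stored
}
```

With no collisions possible, there is no probing and no chain to walk: a lookup is a single array access.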
If it's equally distributed across all possible int values, any function which gives you an equally distributed hash value will do (such as the aforementioned key % 256), albeit with multiple values in each bucket.
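One common way to hold multiple values in each bucket is a linked list per bucket. The sketch below assumes that approach; `struct node`, `chains`, `bucket_of`, `chain_put` and `chain_get` are hypothetical names. The key is cast to unsigned before the modulo so that negative keys also map to a valid bucket.

```c
#include <assert.h>
#include <stdlib.h>

#define BUCKETS 256

struct node {
    int key;
    const char *value;
    struct node *next; // next entry in the same bucket
};

// File-scope array, so every chain starts out empty (NULL).
struct node *chains[BUCKETS];

// Cast to unsigned first so negative keys still give 0..255.
unsigned bucket_of(int key) { return (unsigned)key % BUCKETS; }

// Adds a new node at the front of the key's bucket. Returns 0 if out of memory.
// Does not check whether the key is already present.
int chain_put(int key, const char *value) {
    struct node *n = malloc(sizeof *n);
    if (n == NULL) {
        return 0;
    }
    unsigned b = bucket_of(key);
    n->key = key;
    n->value = value;
    n->next = chains[b];
    chains[b] = n;
    return 1;
}

// Walks the bucket's chain and returns the value, or NULL if the key is absent.
const char *chain_get(int key) {
    for (struct node *n = chains[bucket_of(key)]; n != NULL; n = n->next) {
        if (n->key == key) {
            return n->value;
        }
    }
    return NULL;
}
```

Keys 1 and 257 land in the same bucket here, and a lookup tells them apart by comparing the stored key while walking the chain.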
Without knowing the distribution, it's a little hard to talk about efficient hashes.