I have implemented my own hash table functions in C, but currently it doesn't support resizing. I was wondering what algorithms do exist apart from the brute-force way of creating a new empty hash table and moving everything there?
Resizing a hash table consists of choosing a new hash function to map to the new size, creating a hash table of the new size, iterating through the elements of the old table, and inserting them into the new table. .
A hash table needs to be resized if load factor of a table exceeds 0.7.
A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored.
There is incremental resizing.
From Wikipedia:
Incremental resizing
Some hash table implementations, notably in real-time systems, cannot pay the price of enlarging the hash table all at once, because it may interrupt time-critical operations. If one cannot avoid dynamic resizing, a solution is to perform the resizing gradually:
During the resize, allocate the new hash table, but keep the old table unchanged. In each lookup or delete operation, check both tables. Perform insertion operations only in the new table. At each insertion also move r elements from the old table to the new table. When all elements are removed from the old table, deallocate it.
To ensure that the old table will be completely copied over before the new table itself needs to be enlarged, it is necessary to increase the size of the table by a factor of at least (r + 1)/r during the resizing.
So this is not some clever way of moving all of the elements from the old table into the new table (and if there is one, I haven't seen it); rather, it eases the burden of resizing by allowing the migration to happen gradually.
Wikipedia has some words of wisdom on the subject.
Also, it's not a solution, but could be a part of one - if you're under windows you might use the VirtualAlloc family of functions which allow you to reserve address space without actually committing memory pages. That is, in laymans terms, you would do something like a "malloc" and tell it to "reserve 1000MB, but only make the first 10 available". So if you write past the 10MB, you'd get the usual crash. But when the time comes to expand, you just say "OK, give me another 10MB after the first ones". And the next 10MB is made available at the address directly after the first 10MB. It's like resizing an array. The actual RAM in use will be only as much as you need, but the memory addresses will be reserved in advance so that other memory allocation operations don't use them.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With