Run time to insert n elements into an empty hash table

Question

People say it takes amortized O(1) to put into a hash table. Therefore, putting n elements must be O(n). That's not true for large n, however, since as an answerer said, "All you need to satisfy expected amortized O(1) is to expand the table and rehash everything with a new random hash function any time there is a collision."

So: what is the average running time of inserting n elements into a hash table? I realize this is probably implementation-dependent, so please mention which type of implementation you're talking about.

For example, if there are (log n) equally spaced collisions, and each collision takes O(k) to resolve, where k is the current size of the hashtable, then you'd have this recurrence relation:

T(n) = T(n/2) + n/2 + n/2

(that is, you take the time to insert n/2 elements, then you have a collision, taking n/2 to resolve, then you do the remaining n/2 inserts without a collision). This still ends up being O(n), so yay. But is this reasonable?
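To sanity-check that the recurrence above really is linear, here is a minimal sketch that unrolls T(n) = T(n/2) + n/2 + n/2 numerically. The function name `total_cost`, the integer division, and the base case T(0) = 0 are assumptions made for illustration; they are not part of the question.

```python
def total_cost(n: int) -> int:
    """Unroll T(n) = T(n // 2) + n // 2 + n // 2, assuming T(0) = 0.

    The two n/2 terms are merged into a single n per level, so the result
    is n + n//2 + n//4 + ... down to 1.
    """
    cost = 0
    while n > 0:
        cost += n   # inserts before the collision plus the cost of resolving it
        n //= 2     # recurse on the first half of the insert sequence
    return cost


if __name__ == "__main__":
    for n in (16, 1024, 1_000_000):
        print(n, total_cost(n), total_cost(n) / n)
```

The ratio `total_cost(n) / n` stays below 2 for every n, because the terms form a geometric series. That is why the recurrence resolves to O(n).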

Paul Sonier · Accepted Answer

It depends entirely on how inefficient your rehashing is. Specifically, if you can properly estimate the expected size of your hashtable when you rehash, your runtime still approaches O(n). In effect, you have to specify how inefficient your rehash size calculation is before you can determine the expected order.
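One way to see how much the size calculation matters is to count how many elements get moved by rebuilds under two different growth policies. This is only a sketch under stated assumptions: the table is rebuilt when it is full (not on a collision), and `rehash_moves` and the `grow` callback are hypothetical names introduced here.

```python
from typing import Callable


def rehash_moves(n: int, grow: Callable[[int], int]) -> int:
    """Count element moves caused by rebuilds while inserting n elements.

    `grow` maps the current capacity to the next one (hypothetical policy hook).
    Assumption: the table is rebuilt whenever it is completely full.
    """
    capacity, size, moves = 1, 0, 0
    for _ in range(n):
        if size == capacity:       # table full: rebuild into a larger table
            moves += size          # every stored element is rehashed once
            capacity = grow(capacity)
        size += 1
    return moves


n = 1 << 14
doubling = rehash_moves(n, lambda c: 2 * c)    # good size estimate: geometric growth
additive = rehash_moves(n, lambda c: c + 16)   # poor size estimate: constant increments

if __name__ == "__main__":
    print("doubling:", doubling, "additive:", additive)
```

With doubling, the total number of moves stays below n, so the n inserts cost O(n) overall. With a fixed increment, the rebuilds alone cost on the order of n^2.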

Captain Segfault · Answer

"People say it takes amortized O(1) to put into a hash table."

From a theoretical standpoint, it is expected amortized O(1).

Hash tables are fundamentally a randomized data structure, in the same sense that quicksort is a randomized algorithm. You need to generate your hash functions with some randomness, or else there exist pathological inputs for which operations are not O(1).
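As an illustration of "generating a hash function with some randomness", here is a minimal sketch of a universal-style hash family for non-negative integer keys, h(x) = ((a*x + b) mod p) mod m. The names `random_hash` and `P` are assumptions for this example, and the answer does not say which family it has in mind.

```python
import random

# Assumption: all keys are non-negative integers smaller than P.
P = (1 << 61) - 1  # the Mersenne prime 2^61 - 1


def random_hash(m: int, rng: random.Random):
    """Draw one hash function h(x) = ((a*x + b) mod P) mod m at random."""
    a = rng.randrange(1, P)   # a must be non-zero
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m


if __name__ == "__main__":
    rng = random.Random()
    h = random_hash(16, rng)
    # Keys chosen to collide under the fixed function x % 16:
    keys = [16 * i for i in range(8)]
    print([k % 16 for k in keys])   # every key lands in bucket 0
    print([h(k) for k in keys])     # depends on the randomly drawn a and b
```

Because a and b are drawn after the keys are fixed, an adversary cannot pick one input that is bad for every function in the family. That is what the word "expected" in "expected amortized O(1)" refers to.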

You can achieve expected amortized O(1) using dynamic perfect hashing:

The naive idea I originally posted was to rehash with a new random hash function on every collision. (See also perfect hash functions.) The problem is that keeping the table collision-free this way requires O(n^2) space, by the birthday paradox.

The solution is to have two hash tables, with the second table for collisions; resolve collisions on that second table by rebuilding it. That table will have O(sqrt(n)) elements, so keeping it collision-free means it would grow to O(n) size.
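The birthday-paradox bound behind both paragraphs can be checked numerically. The sketch below assumes an idealized hash function that places each key in a uniformly random slot; `prob_no_collision` is a hypothetical helper name.

```python
def prob_no_collision(n: int, m: int) -> float:
    """Probability that n keys, each placed in one of m slots uniformly at
    random, all land in distinct slots (idealized hashing assumption)."""
    p = 1.0
    for i in range(n):
        p *= (m - i) / m   # the (i+1)-th key must avoid the i occupied slots
    return p


if __name__ == "__main__":
    n = 1000
    print(prob_no_collision(n, n))        # linear space: a collision is near-certain
    print(prob_no_collision(n, n * n))    # quadratic space: no collision with constant probability
    print(prob_no_collision(23, 365))     # the classic birthday setting
```

With m = n slots, the chance of avoiding every collision is vanishingly small, so "rehash on every collision" would never settle. With m = n^2 slots, a random function is collision-free with constant probability, so a constant expected number of retries suffices. The same calculation explains the second table: about sqrt(n) elements need only about n slots to stay collision-free.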

In practice you often just use a fixed hash function because you can assume your input is not pathological (or don't care if it is), much like you often quicksort without prerandomizing the input.

Tags:

algorithm

hashmap

hashtable

hash

runtime

Claudiu
