 

What is the space complexity of a hash table?

What is the size of a hash table with 32-bit keys and 32-bit pointers to values stored separately?

Is it going to be 2^32 slots * (4 bytes (key) + 4 bytes (pointer to value)) = 4 * 10^9 * (4 + 4) bytes = 32 GB?

I am trying to understand the space complexity of hash tables.

asked Jun 30 '11 by Megha Joshi - GoogleTV DevRel



2 Answers

I think you are asking the wrong question. The space complexity of a data structure indicates how much space it occupies in relation to the number of elements it holds. For example, a space complexity of O(1) would mean that the data structure always consumes constant space no matter how many elements you put in it. O(n) would mean that the space consumption grows linearly with the number of elements in it.

A hash table typically has a space complexity of O(n).

So to answer your question: it depends on the number of elements it currently stores, and in the real world also on the actual implementation.

A lower bound for the memory consumption of your hash table is (number of values to store) * (size of a value). So if you want to store 1 million values in the hash table and each occupies 4 bytes, then it will consume at least 4 million bytes (roughly 4 MB). Real-world implementations usually use a bit more memory for infrastructure, but again: this highly depends on the actual implementation, and there is no way to find out for sure but to measure it.
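
To see that "measure it" advice in action, here is a minimal sketch in Python (assuming CPython, whose built-in dict is a hash table; note that sys.getsizeof reports only the dict's internal table, not the key and value objects it references):

```python
import sys

n = 1_000_000

# A CPython dict is a hash table; fill it with n small entries.
table = {i: i for i in range(n)}

lower_bound = n * 4            # 1 million values * 4 bytes each, as above
actual = sys.getsizeof(table)  # the dict's own footprint only; referenced
                               # key/value objects are not included

print(f"lower bound: {lower_bound:,} bytes")
print(f"dict object: {actual:,} bytes")
```

On a typical CPython build the dict object alone comes to tens of megabytes, several times the 4 MB lower bound, because the implementation keeps spare slots and per-entry bookkeeping.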

answered Sep 16 '22 by ChrisWue


Hash tables don't map hash function values one-to-one onto slots. The hash value is reduced modulo the size of a reference vector that is much smaller than the hash function's range. Because the size of this vector is fixed, it is not considered in the space complexity computation.

Consequently, the space complexity of every reasonable hash table is O(n).
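
To make that concrete, here is a minimal separate-chaining sketch in Python (the class and its names are illustrative, not any particular library's implementation):

```python
class HashTable:
    """Toy hash table: the bucket array has a small fixed size,
    independent of the hash function's full 32-bit (or wider) range."""

    def __init__(self, num_buckets=16):
        # Fixed-size reference vector: a constant cost that drops out
        # of the space complexity.
        self.buckets = [[] for _ in range(num_buckets)]

    def put(self, key, value):
        # Reduce the full hash range modulo the bucket count.
        index = hash(key) % len(self.buckets)
        bucket = self.buckets[index]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # one constant-size entry per element

    def get(self, key):
        index = hash(key) % len(self.buckets)
        for k, v in self.buckets[index]:
            if k == key:
                return v
        raise KeyError(key)
```

Total space is the fixed bucket array plus one entry per stored pair, which is where the O(n) comes from. Real implementations also grow the bucket array as n increases, but that keeps it proportional to n and does not change the bound.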

In general, this works out quite well. While the key space may be large, the number of values to store is usually fairly easy to predict, and so is the amount of memory overhead that is acceptable for the data structure.

This is why hash tables are so ubiquitous. They are often the best data structure for a given task, combining strictly bounded memory overhead with better than O(log n) lookup time. I love binary trees, but they don't usually beat hash tables.

answered Sep 20 '22 by DigitalRoss