What is the more efficient approach for using hashmaps?
A) use multiple smaller hashmaps, or
B) store all objects in one giant hashmap?
(Assume that the hashing algorithm for the keys is fairly efficient, resulting in few collisions)
CLARIFICATION: Option A implies segregation by primary key -- i.e. no additional lookup is necessary to determine which actual hashmap to use. (For example, if the lookup keys are alphanumeric, Hashmap 1 stores the A's, Hashmap 2 stores the B's, and so on.)
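For concreteness, here is a rough Java sketch of the two layouts being compared (class and field names are purely illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class HashmapOptions {
    // Option A (hypothetical sketch): one small map per leading character of the
    // key, so the key itself selects the map and no extra lookup is needed.
    static final Map<Character, Map<String, Object>> SEGREGATED = new HashMap<>();

    static Object lookupA(String key) {
        Map<String, Object> bucket = SEGREGATED.get(Character.toUpperCase(key.charAt(0)));
        return bucket == null ? null : bucket.get(key);
    }

    // Option B: a single map holding every entry, keyed directly.
    static final Map<String, Object> GIANT = new HashMap<>();

    static Object lookupB(String key) {
        return GIANT.get(key);
    }
}
```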
Definitely B. The advantage of hash tables is that the average number of comparisons per lookup is independent of the number of entries, provided the load factor stays bounded.
If you split your map into N smaller hashmaps, you will have to search half of them on average for each lookup. If the smaller hashmaps have the same load factor that the larger map would have had, each probe into a small map costs about as much as a single lookup in the big map would, so you increase the total number of comparisons by a factor of approximately N/2.
And if the smaller hashmaps have a smaller load factor, you are wasting memory.
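A minimal sketch of what that lookup pattern looks like when nothing about the key tells you which small map to check (illustrative Java, names hypothetical):

```java
import java.util.List;
import java.util.Map;

public class RandomSplitLookup {
    // The keys were spread across N small maps with no rule linking a key to a
    // particular map, so a lookup has to probe the maps one by one. On average
    // roughly N/2 maps are touched before the key turns up, and each probe costs
    // about as much as one lookup in a single big map would.
    static Object lookup(List<Map<String, Object>> maps, String key) {
        for (Map<String, Object> m : maps) {
            Object value = m.get(key);
            if (value != null) {
                return value;
            }
        }
        return null; // probed every map; key is absent
    }
}
```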
All that is assuming you distribute the keys randomly between the smaller hashmaps. If you distribute them according to some function of the key (e.g. a string prefix) then what you have created is a trie, which is efficient for some applications (e.g. auto-complete in web forms.)
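To illustrate the prefix case, here is a hedged sketch of bucketing by the first character of the key; it behaves like a one-level trie, which is what makes prefix queries such as auto-complete cheap (all names here are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class PrefixBuckets {
    // Bucketing entries by the first character of the key: every candidate for
    // an auto-complete query beginning with a given letter sits in one bucket
    // and can be scanned without touching the rest of the data.
    private final Map<Character, Map<String, String>> buckets = new HashMap<>();

    void put(String key, String value) {
        buckets.computeIfAbsent(key.charAt(0), c -> new HashMap<>()).put(key, value);
    }

    Set<String> candidatesFor(char firstLetter) {
        Map<String, String> bucket = buckets.get(firstLetter);
        return bucket == null ? Set.of() : bucket.keySet();
    }
}
```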
Are these maps used in logically distinct places? For instance, I wouldn't have one map containing users, cached query results, loggers, etc., just because you happen to know the keys won't clash. However, I equally wouldn't split up a single logical map into multiple maps.
Keep one hashmap for each logical mapping from key to value.
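A small sketch of that rule of thumb, assuming hypothetical User and QueryResult types:

```java
import java.util.HashMap;
import java.util.Map;

public class SessionState {
    // One map per logical mapping: the maps are separate because users and
    // cached query results are different kinds of things, not because a
    // single map would be too large.
    private final Map<String, User> usersById = new HashMap<>();
    private final Map<String, QueryResult> cachedResultsByQuery = new HashMap<>();

    // Folding both into one Map<String, Object> would force casts at every
    // call site and rely on key naming conventions to keep entries apart.
    record User(String name) {}
    record QueryResult(String payload) {}
}
```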