Recently, in an interview I was asked, what exactly is a bucket in hashmap? Whether it is an array or a arraylist or what?
I got confused. I know hashmaps are backed by arrays. So can I say that bucket is an array with a capacity of 16 in the start storing hashcodes and to which linked lists have their start pointer ?
I know how a hashmap internally works, just wanted to know what exactly is a bucket in terms of data structures.
HashMap stores elements in so-called buckets and the number of buckets is called capacity. When we put a value in the map, the key's hashCode() method is used to determine the bucket in which the value will be stored. To retrieve the value, HashMap calculates the bucket in the same way – using hashCode().
When a HashMap is created its Initial Capacity will be 16, it means there will be 16 Buckets in memory where key value pair will be stored. Now if implementation of my hashCode() in such a way that its generating random number in range of 1000 and 10000.
The default load factor of HashMap is 0.75f (75% of the map size). The problem is, keeping the bucket size fixed (i.e., 16), we keep on increasing the total number of items in the map that disturbs time complexity. When we increase the total number of buckets, total items in each bucket starts increasing.
The Initial Capacity is essentially the number of buckets in the HashMap which by default is 24 = 16. A good HashMap algorithm will distribute an equal number of elements to all the buckets.
No, a bucket is each element in the array you are referring to. In earlier Java versions, each bucket contained a linked list of Map entries. In new Java versions, each bucket contains either a tree structure of entries or a linked list of entries.
From the implementation notes in Java 8:
/*
* Implementation notes.
*
* This map usually acts as a binned (bucketed) hash table, but
* when bins get too large, they are transformed into bins of
* TreeNodes, each structured similarly to those in
* java.util.TreeMap. Most methods try to use normal bins, but
* relay to TreeNode methods when applicable (simply by checking
* instanceof a node). Bins of TreeNodes may be traversed and
* used like any others, but additionally support faster lookup
* when overpopulated. However, since the vast majority of bins in
* normal use are not overpopulated, checking for existence of
* tree bins may be delayed in the course of table methods.
...
I hope this may help you to understand the implementation of hash map well.
Buckets exactly is an array of Nodes. So single bucket is an instance of class java.util.HashMap.Node. Each Node is a data structure similar to LinkedList, or may be like a TreeMap (since Java 8), HashMap decides itself what is better for performance--keep buckets as LinkedList or TreeMap. TreeMap will be only chosen in case of poorly designed hashCode() function, when lots of entries will be placed in single bucket. See how buckets look like in HashMap:
/**
* The table, initialized on first use, and resized as
* necessary. When allocated, length is always a power of two.
* (We also tolerate length zero in some operations to allow
* bootstrapping mechanics that are currently not needed.)
*/
transient Node<K,V>[] table;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With