Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does HashSet do the sort job internal?

My Set is sometimes sorted, and sometimes not.

Here is the example:

public class SetOfInteger {
    public static void main(String[] args) {
        Random rand = new Random(47);
        Set<Integer> intset = new HashSet<>();
        for (int i = 0; i < 10; i++) {
            int j = rand.nextInt(30);
            System.out.print(j + " ");
            intset.add(j);
        }
        System.out.println();
        System.out.println(intset);
    }
}

The result shows that the set is not sorted.

8 5 13 11 1 29 28 20 12 7 
[1, 20, 5, 7, 8, 11, 12, 29, 28, 13]

When I change the termination expression to i < 20 in the for statement, the result shows that the set become sorted.

8 5 13 11 1 29 28 20 12 7 18 18 21 19 29 28 28 1 20 28 
[1, 5, 7, 8, 11, 12, 13, 19, 18, 21, 20, 29, 28]

It is so strange, is it? I just don't know how to explain it, and I need some help, Thank you very much.

like image 964
Harold Avatar asked Mar 26 '16 14:03

Harold


People also ask

Is sorting possible in HashSet?

Hence sorting of HashSet is not possible. However, the elements of the HashSet can be sorted indirectly by converting into List or TreeSet, but this will keep the elements in the target type instead of HashSet type.

How does the HashSet work internally?

HashSet internally uses HashMap to store it's elements. Whenever you create a HashSet object, one HashMap object associated with it is also created. This HashMap object is used to store the elements you enter in the HashSet. The elements you add into HashSet are stored as keys of this HashMap object.

Is HashSet by default sorted?

No, HashSet is not sorted - or at least, not reliably. You may happen to get ordering in some situations, but you must not rely on it. For example, it's possible that it will always return the entries sorted by "hash code modulo some prime" - but it's not guaranteed, and it's almost certainly not useful anyway.

Which data structure HashSet uses internally?

HashSet uses HashMap for storing its object internally.


2 Answers

A HashSet does not guarantee sorted iteration, but under very specific circumstances its internal data structure may act like a bucket sort.

Specifically, for integer keys in the range [0,65535] and a table size that is greater than the largest key, the index of the bucket a key is stored in is equal to the key itself, and since the iterator iterates in bucket order, it emits the elements in sorted order.

like image 106
meriton Avatar answered Sep 21 '22 15:09

meriton


There are some good answers all around, but none attempt to explain what exactly happens in this particular situation, so I'll limit my answer to that, rather than add another explanation of how the HashSet works. I'm taking that understanding as granted.

The default constructor of HashSet creates a set with a capacity of 16 and a load factor of 0.75. That means there are 16 bins, and this capacity is increased when you insert 16 * 0.75 = 12 unique elements.

That's why in the first case, the numbers are sorted by their remainder when divided by 16: the set started with a table size of 16, "hashing" each element to a bin by taking x % 16. Then when there were 12 elements, it grew the table and performed a rehash (see Javier Martin's answer if that's not clear), probably growing the table to 32. (I could only find information about how it grows in the java 6 doc, which states that the number of buckets is "approximately" doubled, whatever that means.) That gave each integer under 30 its own bin, so when the set iterated over each bin in order, it iterated over the numbers in order. If you inserted numbers below 64, you'd probably find that you need to insert 32*0.75 = 24 elements before the iteration appears sorted.

Also note that this way of assigning bins is not guaranteed behavior. HashSets in other Java versions/implementations might do something more complicated with the objects' hashCode() values than simply taking a remainder. (As noted by ruakh and fluffy in the comments - thanks!)

like image 38
MPeti Avatar answered Sep 25 '22 15:09

MPeti