Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Bug: parameter 'initialCapacity' of ConcurrentHashMap's construct method?

one of the construct method of java.util.concurrent.ConcurrentHashMap:

public ConcurrentHashMap(int initialCapacity) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException();
        int cap = ((initialCapacity >= (MAXIMUM_CAPACITY >>> 1)) ?
                   MAXIMUM_CAPACITY :
                   tableSizeFor(initialCapacity + (initialCapacity >>> 1) + 1));
        this.sizeCtl = cap;
    }

What does the parameter for method 'tableSizeFor(...)' mean?

initialCapacity + (initialCapacity >>> 1) + 1

I think the parameter should be like :

(int)(1.0 + (long)initialCapacity / LOAD_FACTOR)

or just:

initialCapacity

I think the parameter expression is wrong, at least is a bug.Did I misunderstand something?

I send a bug report to OpenJDK, seems they officially confirmed it is most likely a bug: https://bugs.openjdk.java.net/browse/JDK-8202422

Update: Doug Lea commented on the bug,seems that he agree it is a bug.

like image 456
Anonemous Avatar asked Apr 29 '18 06:04

Anonemous


1 Answers

I strongly suppose it’s an optimization trick.

You’re on to the correct thought. The constructor you cite uses a the default load factor of 0.75, so to accommodate initialCapacity elements the hash table size needed to be at least

initialCapacity / 0.75

(roughly the same as multiplying by 1.3333333333). However floating-point divisions are expensive (a slight bit, not bad). And we would additionally need to round up to an integer. I guess that an integer division would already help

(initialCapacity * 4 + 2) / 3

(the + 2 is for making sure that the result is rounded up; the * 4 ought to be cheap since it can be implemented as a left shift). The implementors do even better: shifts are a lot cheaper than divisions.

initialCapacity + (initialCapacity >>> 1) + 1

This is really multiplying by 1.5, so is giving us a result that will often be greater than needed, but it’s fast. The + 1 is to compensate for the fact that the “multiplication” rounded down.

Details: the >>> is an unsigned right shift, filling a zero into the leftmost position. Already knowing that initialCapacity was non-negative this gives the same result as a division by 2, ignoring the remainder.

Edit: I may add that tableSizeFor rounds up to a power of 2, so most often the same power of 2 will be the final result even when the first calculation gave a slightly greater result than needed. For example, if you ask for capacity for 10 elements (to keep the calculation simple), table size 14 would be enough, where the formula yields 16. But the 14 would be rounded up to a power of 2, so we get 16 anyway, so in the end there is no difference. If you asked for room for 12 elements, size 16 would still suffice, but the formula yields 19, which is then rounded up to 32. This is the more unusual case.

Further edit: Thank you for the information in the comments that you have submitted this as a JDK bug and for providing the link: https://bugs.openjdk.java.net/browse/JDK-8202422. The first comment by Marin Buchholz agrees with you:

Yes, there is a bug here. The one-arg constructor effectively uses a load-factor of 2/3, not the documented default of 3/4…

I myself would not have considered this a bug unless you regard it as a bug that you occasionally get a greater capacity than you asked for. On the other hand you are right, of course (in your exemplarily terse bug report) that there is an inconsistency: You would expect new ConcurrentHashMap(22) and new ConcurrentHashMap(22, 0.75f, 1) to give the same result since the latter just gives the documented default load factor/table density; but the table sizes you get are 64 from the former and 32 from the latter.

like image 68
Ole V.V. Avatar answered Sep 20 '22 11:09

Ole V.V.