Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Concurrent collection size calculation

The documentation of most collections in the Java standard library such as ConcurrentLinkedQueue, ConcurrentLinkedDequeue and ConcurrentSkipListSet come with the following disclaimer:

Beware that, unlike in most collections, the size method is not a constant-time operation. Because of the asynchronous nature of these sets, determining the current number of elements requires a traversal of the elements, and so may report inaccurate results if this collection is modified during traversal.

What does that mean? Why can't they keep a counter (say, an AtomicInteger) and just return the value on calls to size()?

Is it because the counter must be synchronized and therefore creates a choke point?

As a side note, ConcurrentHashMap does not seem to have this issue. Why is that? Looking at the source code, it seems like it uses multiple counters kept in an array that are summed on calls to size(). Is that to circumvent the choke point or is there another reason?

like image 779
Gustav Karlsson Avatar asked May 19 '15 19:05

Gustav Karlsson


1 Answers

Using a shared resource to maintain the size() would be both expensive and rather useless. size() is likely to be incorrect as soon as it returns as you cannot hold a lock on the collection so it can change between you calling it and when you get the value.

ConcurentHashMap has the same approach. The size() can be incorrect before the method has returned.

like image 159
Peter Lawrey Avatar answered Nov 15 '22 17:11

Peter Lawrey