Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Whats the best way to obtain a concurrent hash set when write operations are excess than read operations?

I found that we can obtain a concurrent hash set using newKeySet(); or with keySet(default value) from a ConcurrentHashMap. Is this the best way to create a thread-safe set when write operations are excess than the read operation.

I read about CopyOnWriteArraySet seems its better when reading is excess than writing.

Welcomes all answers that may help us to know a little more on this.

like image 254
Renjith K N Avatar asked Aug 02 '19 07:08

Renjith K N


1 Answers

ConcurrentHashMap.newKeySet() and ConcurrentHashMap.keySet() that return a KeySetView rely on the ConcurrentHashMap class that locks at the writing only and only the key mapping concerned by the writing and not the whole map.
Consequently, reading operations are fast but writing too (very slightly slower in the facts).

CopyOnWriteArraySet that relies on CopyOnWriteArrayList doesn't lock for reading but locks for writing operations. As a consequence to guarantee the concurrent reading consistency each writing operation triggers a copy of the underlying array.
So in the case of not small collection (100 element or more for example), CopyOnWriteArraySet writing operations should be more expensive (lock on the whole collection + copy of the underlying array) than KeySetView (lock only and uniquely the concerned entry).

The CopyOnWriteArraySet javadoc underlines that point :

  • It is best suited for applications in which set sizes generally stay small, read-only operations vastly outnumber mutative operations, and you need to prevent interference among threads during traversal.

  • Mutative operations (add, set, remove, etc.) are expensive since they usually entail copying the entire underlying array


Here is a benchmark where we compare both behavior.
The Set<String>s are initialized with 100 elements ("0" to 99" value) at each iteration.
The 6 first methods are writing operations (remove, add a new element, overwrite an existing element).
while the 4 next methods are reading operations (iterate, contains).

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Threads(8)
public class SetBenchmark {

    private Set<String> keySetView;
    private Set<String> copyOnWriteArraySet;
    private Random random;

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder().include(SetBenchmark.class.getSimpleName())
                                          .warmupIterations(5)
                                          .measurementIterations(5)
                                          .forks(1)
                                          .build();

        new Runner(opt).run();
    }

    @Setup(Level.Iteration)
    public void doSetup() {
        random = new Random(1);
        keySetView = ConcurrentHashMap.newKeySet();
        copyOnWriteArraySet = new CopyOnWriteArraySet<>();
        init(keySetView);
        init(copyOnWriteArraySet);

    }

    private void init(Set<String> set) {
        IntStream.range(0, 100)
                 .forEach(i -> {
                     final String string = String.valueOf((char) i);
                     set.add(string);
                 });

    }


    // Writing
    @Benchmark
    public void _1_keySetView_remove() {
        doRemove(keySetView);
    }

    @Benchmark
    public void _2_copyOnWriteArraySet_remove() {
        doRemove(copyOnWriteArraySet);
    }

    @Benchmark
    public void _3_keySetView_add_with_new_value() {
        doAddWithNewValue(keySetView);
    }

    @Benchmark
    public void _4_copyOnWriteArraySet_add_with_new_value() {
        doAddWithNewValue(copyOnWriteArraySet);
    }

    @Benchmark
    public void _5_keySetView_add_with_existing_value() {
        doAddWithExistingValue(keySetView);
    }

    @Benchmark
    public void _6_copyOnWriteArraySet_add_with_existing_value() {
        doAddWithExistingValue(copyOnWriteArraySet);
    }

    // Reading
    @Benchmark
    public void _7_keySetView_iterate() {
        String res = doIterate(keySetView);
    }

    @Benchmark
    public void _8_copyOnWriteArraySet_iterate() {
        String res = doIterate(copyOnWriteArraySet);
    }

    @Benchmark
    public void _9_keySetView_contains() {
        boolean res = doContains(keySetView);
    }

    @Benchmark
    public void _010_copyOnWriteArraySet_contains() {
        boolean res = doContains(copyOnWriteArraySet);
    }


    // Writing
    private void doRemove(Set<String> set) {
        set.remove(getRandomString());
    }

    private void doAddWithNewValue(Set<String> set) {
        set.add(getRandomString() + set.size());
    }

    private void doAddWithExistingValue(Set<String> set) {
        set.add(getRandomString());
    }

    // Reading
    private String doIterate(Set<String> set) {
        String result = "";
        for (String string : set) {
            result += string;
        }
        return result;
    }

    private boolean doContains(Set<String> set) {
        return set.contains(getRandomString());
    }

    // Random value with seed
    private String getRandomString() {
        return String.valueOf(random.nextInt(100));
    }

}

There is the result (lower score is better).

Writing operations :

Benchmark                                                   Mode  Cnt      Score   Error  Units
SetBenchmark._1_keySetView_remove                            avgt          61,659          ns/op
SetBenchmark._2_copyOnWriteArraySet_remove                   avgt         249,976          ns/op

SetBenchmark._3_keySetView_add_with_new_value                avgt         240,589          ns/op
SetBenchmark._4_copyOnWriteArraySet_add_with_new_value       avgt       30691,318          ns/op

SetBenchmark._5_keySetView_add_with_existing_value           avgt          84,472          ns/op
SetBenchmark._6_copyOnWriteArraySet_add_with_existing_value  avgt         473,592          ns/op

Reading operations :

Benchmark                                                   Mode  Cnt      Score   Error  Units
SetBenchmark._7_keySetView_iterate                           avgt       13603,012          ns/op
SetBenchmark._8_copyOnWriteArraySet_iterate                  avgt       13626,146          ns/op

SetBenchmark._9_keySetView_contains                          avgt          53,081          ns/op
SetBenchmark._10_copyOnWriteArraySet_contains                avgt         250,401          ns/op

Reading are as fast for the two Set implementations only for the iterator.
And writing is much slower for CopyOnWriteArraySet in any case but when we add() a not existing value, that is still worse.

like image 133
davidxxx Avatar answered Oct 14 '22 10:10

davidxxx