Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Concurrent HashMap iterator:How safe is it for Threading?

I used concurrent hashmap for creating a matrix. It indices ranges to 100k. I have created 40 threads. Each of the thread access those elements of matrices and modifies to that and write it back of the matrix as:

ConcurrentHashMap<Integer, ArrayList<Double>> matrix = 
    new ConcurrentHashMap<Integer, ArrayList<Double>>(25);

for (Entry(Integer,ArrayList<Double>)) entry: matrix.entrySet())
    upDateEntriesOfValue(entry.getValue());     

I did not found it thread safe. Values are frequently returned as null and my program is getting crashed. Is there any other way to make it thread safe.Or this is thread safe and i have bug in some other places. One thing is my program does not crash in single threaded mode.

like image 528
thetna Avatar asked Dec 09 '22 22:12

thetna


2 Answers

The iterator is indeed thread-safe for the ConcurrentHashMap.

But what is not thread-safe in your code is the ArrayList<Double> you seem to update! Your problems might come from this data structure.

You may want to use a concurrent data structure adapted to you needs.

like image 141
Jean Logeart Avatar answered Dec 11 '22 11:12

Jean Logeart


Using a map for a matrix is really inefficient, and the way you have used it, it won't even support sparse arrays particularly well.

I suggest you use a double[][] where you lock each row (or column if that is better) If the matrix is small enough you may be better of using only one CPU as this can save you quite a bit of overhead.

I would suggest you create no more threads than you have cores. For CPU intensive tasks, using more thread can be slower, not faster.

Matrix is 100k*50 at max

EDIT: Depending on the operation you are performing, I would try to ensure you have the shorter dimension first so you can process each long dimension in a different thread efficiently.

e.g

double[][] matrix = new double[50][100*1000];
for(int i=0;i<matrix.length;i++) {
   final double[] line = matrix[i];
   executorService.submit(new Runnable() {
       public void run() {
          synchronized(line) {
              processOneLine(line);
          }
       }
   });
}

This allows all you thread to run concurrently because they don't share any data structures. They can also access each double efficiently because they are continuous in memory and stored as efficiently as possible. i.e. 100K doubles uses about 800KB, but List<Double> uses 2800KB and each value can be randomly arranged in memory which means your cache has to work much harder.

thanks but in fact i have 80 cores in total

To uses 80 core efficiently you might want to break the longer lines in two or four so you can keep all the cores busy, or find a way to perform more than one operation at a time.

like image 28
Peter Lawrey Avatar answered Dec 11 '22 10:12

Peter Lawrey