
When does parallel processing outperform sequential processing?

//    parallel processing

    int processors = Runtime.getRuntime().availableProcessors();
    ExecutorService executorService = Executors.newFixedThreadPool(processors);

    final List<String> albumIds2 = new ArrayList<String>();
    long start2 = System.nanoTime();
    for (final HColumn<String, String> column : result.get().getColumns()) {
        Runnable worker = new Runnable() {

            @Override
            public void run() {
                albumIds2.add(column.getName());
            }
        };
        executorService.execute(worker);
    }
    // note: this stops the clock after submitting the tasks,
    // not after they have finished running
    long timeTaken2 = System.nanoTime() - start2;

I have code like the example above, which builds a List<String> of album ids; the columns are a slice read from a Cassandra database. I record the time taken for the whole list of albums to be created.

I have done the same thing using a plain enhanced for loop, like below.

    List<String> albumIds = new ArrayList<String>();
    QueryResult<ColumnSlice<String, String>> result = CassandraDAO.getRowColumns(AlbumIds_CF, customerId);
    long start = System.nanoTime();
    for (HColumn<String, String> column : result.get().getColumns()) {
        albumIds.add(column.getName());
    }
    long timeTaken = System.nanoTime() - start;

I am noticing that no matter how large the number of albums, the for-each loop always takes less time to complete. Am I doing something wrong, or do I need a computer with multiple cores? I am really new to the whole concept of parallel computing, so please pardon me if my question is stupid.

asked Mar 17 '13 by qualebs
1 Answer

In your parallel example, you are submitting one task for each column. The overhead of enqueuing each task is probably greater than the benefit of parallel execution, and this is exacerbated by the "task" being a very fast one (insert a single element into a list and return). The Executor adds each submitted task to an internal queue, and that addition is itself costly. So you are adding N tasks to a queue, and each task then adds a single element to the list; only that latter, trivial part actually runs concurrently.

If the task were more complex, you could submit the work in "chunks" (for instance, with N elements and P processors, each chunk would hold N/P or N/P + 1 elements). That strategy helps reduce the overhead; a sketch follows.
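A minimal sketch of that chunking strategy, assuming a hypothetical process(String) method that stands in for the real per-element work (the names ChunkedSubmit and processInChunks are illustrative, not from the question):

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class ChunkedSubmit {

        // hypothetical per-element work; chunking only pays off when this is non-trivial
        static void process(String item) { /* ... */ }

        static void processInChunks(final List<String> items) throws InterruptedException {
            int p = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(p);
            int n = items.size();
            int chunkSize = (n + p - 1) / p;  // ceiling of N/P, so at most P chunks

            for (int from = 0; from < n; from += chunkSize) {
                final List<String> chunk = items.subList(from, Math.min(from + chunkSize, n));
                // one task per chunk instead of one task per element
                pool.execute(new Runnable() {
                    @Override
                    public void run() {
                        for (String item : chunk) {
                            process(item);
                        }
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);  // wait for all chunks to finish
        }
    }

With one Runnable per chunk, the executor's queue sees at most P insertions instead of N, so the fixed cost of submission is paid only a handful of times.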

Note also that ArrayList is not synchronized, so the concurrent execution of several tasks may corrupt your list. You could use a concurrent collection to avoid this issue, but the first observation still stands.
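For instance (a sketch detached from your Cassandra types), a ConcurrentLinkedQueue accepts concurrent additions safely, or an ArrayList can be wrapped with Collections.synchronizedList:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    public class SafeCollections {
        public static void main(String[] args) {
            // option 1: a lock-free queue designed for many concurrent writers
            Queue<String> albumIds = new ConcurrentLinkedQueue<String>();
            albumIds.add("album-1");  // safe to call from any task

            // option 2: wrap an ArrayList so every method call takes a lock
            List<String> syncIds = Collections.synchronizedList(new ArrayList<String>());
            syncIds.add("album-2");   // safe, but all tasks contend on one lock
        }
    }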

answered by Javier