Why is my multi threaded sorting algorithm not faster than my single threaded mergesort

Question

There are certain algorithms whose running time can decrease significantly when one divides up a task and gets each part done in parallel. One of these algorithms is merge sort, where a list is divided into infinitesimally smaller parts and then recombined in a sorted order. I decided to do an experiment to test whether or not I could I increase the speed of this sort by using multiple threads. I am running the following functions in Java on a Quad-Core Dell with Windows Vista.

One function (the control case) is simply recursive:

// x is an array of N elements in random order
public int[] mergeSort(int[] x) {
    if (x.length == 1) 
        return x;

    // Dividing the array in half
    int[] a = new int[x.length/2];
    int[] b = new int[x.length/2+((x.length%2 == 1)?1:0)];
    for(int i = 0; i < x.length/2; i++) 
        a[i] = x[i];
    for(int i = 0; i < x.length/2+((x.length%2 == 1)?1:0); i++) 
        b[i] = x[i+x.length/2];

    // Sending them off to continue being divided
    mergeSort(a);
    mergeSort(b);

    // Recombining the two arrays
    int ia = 0, ib = 0, i = 0;
    while(ia != a.length || ib != b.length) {
        if (ia == a.length) {
            x[i] = b[ib];
            ib++;
        }
        else if (ib == b.length) {
            x[i] = a[ia];
            ia++;
        }
        else if (a[ia] < b[ib]) {
            x[i] = a[ia];
            ia++;
        }
        else {
            x[i] = b[ib];
            ib++;
        }
        i++;
    }

    return x;
}

The other is in the 'run' function of a class that extends thread, and recursively creates two new threads each time it is called:

public class Merger extends Thread
{
    int[] x;
    boolean finished;

    public Merger(int[] x)
    {
        this.x = x;
    }

    public void run()
    {
        if (x.length == 1) {
            finished = true;
            return;
        }

        // Divide the array in half
        int[] a = new int[x.length/2];
        int[] b = new int[x.length/2+((x.length%2 == 1)?1:0)];
        for(int i = 0; i < x.length/2; i++) 
            a[i] = x[i];
        for(int i = 0; i < x.length/2+((x.length%2 == 1)?1:0); i++) 
            b[i] = x[i+x.length/2];

        // Begin two threads to continue to divide the array
        Merger ma = new Merger(a);
        ma.run();
        Merger mb = new Merger(b);
        mb.run();

        // Wait for the two other threads to finish 
        while(!ma.finished || !mb.finished) ;

        // Recombine the two arrays
        int ia = 0, ib = 0, i = 0;
        while(ia != a.length || ib != b.length) {
            if (ia == a.length) {
                x[i] = b[ib];
                ib++;
            }
            else if (ib == b.length) {
                x[i] = a[ia];
                ia++;
            }
            else if (a[ia] < b[ib]) {
                x[i] = a[ia];
                ia++;
            }
            else {
                x[i] = b[ib];
                ib++;
            }
            i++;
        }

        finished = true;
    }
}

It turns out that function that does not use multithreading actually runs faster. Why? Does the operating system and the java virtual machine not "communicate" effectively enough to place the different threads on different cores? Or am I missing something obvious?

SyntaxT3rr0r · Accepted Answer

The problem is not multi-threading: I've written a correctly multi-threaded QuickSort in Java and it owns the default Java sort. I did this after witnessing a gigantic dataset being process and had only one core of a 16-cores machine working.

One of your issue (a huge one) is that you're busy looping:

 // Wait for the two other threads to finish 
 while(!ma.finished || !mb.finished) ;

This is a HUGE no-no: it is called busy looping and you're destroying the perfs.

(Another issue is that your code is not spawning any new threads, as it has already been pointed out to you)

You need to use other way to synchronize: an example would be to use a CountDownLatch.

Another thing: there's no need to spawn two new threads when you divide the workload: spawn only one new thread, and do the other half in the current thread.

Also, you probably don't want to create more threads than there are cores availables.

See my question here (asking for a good Open Source multithreaded mergesort/quicksort/whatever). The one I'm using is proprietary, I can't paste it.

Multithreaded quicksort or mergesort

I haven't implemented Mergesort but QuickSort and I can tell you that there's no array copying going on.

What I do is this:

pick a pivot
exchange values as needed
have we reached the thread limit? (depending on the number of cores)
- yes: sort first part in this thread
- no: spawn a new thread
sort second part in current thread
wait for first part to finish if it's not done yet (using a CountDownLatch).

The code spawning a new thread and creating the CountDownLatch may look like this:

            final CountDownLatch cdl = new CountDownLatch( 1 );
            final Thread t = new Thread( new Runnable() {
                public void run() {
                    quicksort(a, i+1, r );
                    cdl.countDown();
                }
            } };

The advantage of using synchronization facilities like the CountDownLatch is that it is very efficient and that your not wasting time dealing with low-level Java synchronization idiosynchrasies.

In your case, the "split" may look like this (untested, it is just to give an idea):

if ( threads.getAndIncrement() < 4 ) {
    final CountDownLatch innerLatch = new CountDownLatch( 1 );
    final Thread t = new Merger( innerLatch, b );
    t.start();
    mergeSort( a );
    while ( innerLatch.getCount() > 0 ) {
        try {
            innerLatch.await( 1000, TimeUnit.SECONDS );
        } catch ( InterruptedException e ) {
            // Up to you to decide what to do here
        }
    }
} else {
    mergeSort( a );
    mergeSort( b );
}

(don't forget to "countdown" the latch when each merge is done)

Where you'd replace the number of threads (up to 4 here) by the number of available cores. You may use the following (once, say to initialize some static variable at the beginning of your program: the number of cores is unlikely to change [unless you're on a machine allowing CPU hotswapping like some Sun systems allows]):

Runtime.getRuntime().availableProcessors()

Julien Rentrop · Answer

As others said; This code isn't going to work because it starts no new threads. You need to call the start() method instead of the run() method to create new threads. It also has concurrency errors: the checks on the finished variable are not thread safe.

Concurrent programming can be pretty difficult if you do not understand the basics. You might read the book Java Concurrency in Practice by Brian Goetz. It explains the basics and explains constructs (such as Latch, etc) to ease building concurrent programs.

Why is my multi threaded sorting algorithm not faster than my single threaded mergesort

Tags:

java

multithreading

parallel-processing

Robz

2 Answers

SyntaxT3rr0r

Julien Rentrop

Recent Activity

Donate For Us

Why is my multi threaded sorting algorithm not faster than my single threaded mergesort

Tags:

java

multithreading

parallel-processing

Robz

2 Answers

SyntaxT3rr0r

Julien Rentrop

Related questions

Recent Activity

Donate For Us