Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is my Java Merge-Sort faster than my C++ implementation?

I implemented the Merge Sort in Java and in C++ and I tried to implement them as similar as possible. Both algorithms work, I tested them many times. The problem is that my Java-Implementation is a lot faster than my C++-Implementation and I am wondering why. I can't believe that Java would be faster so I guess I made a mistake in one of the implementations. In order to measure the runtime I created a class "Person" which has two String-Attributes (forename, lastname). In C++ I used std::vector<Person*> and in Java I used ArrayList<Person>. Additionally I overloaded the operator< in C++ to compare two Persons (compare the lastname, if equal compare the firstname). In Java I implemented the interface Comparable<Person> to compare two Persons.

Can you find a mistake in my code or a reason why Java would be faster or C++ would be slower? Any help would be appreciated.

My Java Code:

public void mergeSort(List<T> list) {
    if (list.size() <= 1) {
        return;
    }

    int subLength = (int) (list.size() / 2);
    List<T> first = new ArrayList<T>(list.subList(0, subLength));
    List<T> second = new ArrayList<T>(list.subList(subLength, list.size()));


    mergeSort(first);
    mergeSort(second);

    merge(first, second, list);
    return;
}

private void merge(List<T> first, List<T> second, List<T> result) {
    int firstPos = 0, secondPos = 0, resultPos = 0;

    while (firstPos < first.size() && secondPos < second.size()) {
        if (first.get(firstPos).compareTo(second.get(secondPos)) < 0) {
            result.set(resultPos, first.get(firstPos));
            firstPos++;
        } else {
            result.set(resultPos, second.get(secondPos));
            secondPos++;
        }
        resultPos++;
    }

    for (int i = firstPos; i < first.size(); i++) {
        result.set(resultPos, first.get(i));
        resultPos++;
    }
    for (int i = secondPos; i < second.size(); i++) {
        result.set(resultPos, second.get(i));
        resultPos++; 
    }
}

My C++ Code:

Note: I used two template-methods to make the mergesort usable with both Person and Person*.

template<typename T>
    T * ptr(T & obj) { return &obj; } 

    template<typename T>
    T * ptr(T * obj) { return obj; }


void mergeSort(std::vector<T> &list) {
        if (list.size() <= 1) {
            return;
        }

        int subLength = (int)(list.size() / 2);
        std::vector<T> first(list.begin(), list.begin() + subLength);
        std::vector<T> second(list.begin() + subLength, list.end());

        mergeSort(first);
        mergeSort(second);

        merge(first, second, list);

    }
void merge(const std::vector<T> &first, const std::vector<T> &second, std::vector<T> &result) {
        int firstPos = 0, secondPos = 0, resultPos = 0;
        while (firstPos < first.size() && secondPos < second.size()) {
            if (*ptr(first[firstPos]) < *ptr(second[secondPos])) {
                result[resultPos] = first[firstPos];
                firstPos++;
            }
            else {
                result[resultPos] = second[secondPos];
                secondPos++;
            }
            resultPos++;
        }

        for (int i = firstPos; i < first.size(); i++) {
            result[resultPos] = first[i];
            resultPos++;
        }
        for (int i = secondPos; i < second.size(); i++) {
            result[resultPos] = second[i];
            resultPos++;
        }
    }

Edit1 and 2:

My setup-configuration:

I used one million, 10 millionen and 20 millionen persons to test the implementations. I doesn't really matter with how many Persons I test it with, Java is always faster.

And I test it the following way: I create persons and initialize my MergeSort-class. Then I start the measurement and I call my mergeSort-method. When the sorting is finished I stop the measurement. (Deleting is happening after the time measurement). For the time measurement in Java I use System.nanoTime() and in C++ I use chrono::high_resolution_clock::time_point

Of course I compiled C++ in "Release"-Mode (Optimization: O2, faster code preferred).

My Test-PC:

  • Intel i5 2500
  • 8GB DDR3 RAM
  • Windows 10 Education

Edit3:

There is one thing I forgot to mention. I implemented the algorithm in a generic way in order to use simple data-types as well as objects. When I use std::vector<int> and ArrayList<Integer> in Java then my C++ implementation is a lot faster. My first attempt was to use std::vector<Person> but it was even slower. So my guess was that it made deep copies instead of shallow ones and that's why I switched to Person* because I thought that when copying is happening only the pointers will be copied.

like image 264
killexe Avatar asked Feb 06 '23 10:02

killexe


1 Answers

TL;DR - The Java version does alot less array copying.

The ArrayList constructor (see line 167) of ArrayList when passed a Collection uses Collection.toArray() and optionally Arrays.copyOf if required. In the case of an ArrayList there is no need to take a copy - toArray() returns a reference to the underlying array.

Note also that the if (elementData.getClass() != Object[].class) will cause the array not to be copied again.

The Java List.subList on ArrayList objects does not copy any of the underlying array, it just returns an ArrayList backed by the original but limited to the elements required.

As a result - in may cases the whole mechanism uses sub-lists that actually just refer to the original array - no copying is required.

Not too familiar with C++ but I suspect there is a lot of array copying and allocation going on which just doesn't need to be done by Java.

ADDED - As @ThomasKläger rightly pointed out, ArrayList.toArray actually does return a defensive copy of the array - so I was wrong above.

like image 150
OldCurmudgeon Avatar answered Feb 08 '23 15:02

OldCurmudgeon