I wrote a simple C++ code to check the speed of sorting data , represented in the form of a list and then a vector.
In the case of the list I am getting time as 27 seconds. For a vector I get 10 seconds. Why the huge performance gap? Aren't the algorithms used for sorting the list and the vector the same? viz. mergesort?
EDIT: I may be wrong on the last point. As I know, textbooks when descirbing sorting algorithms theoretically, seem to be use the word list
in the sense of a std::vector
. I don't know how
how sorting algorithms for vectors would be different from sorting algorithms for lists, so if some one could clarify that would be really helpful. Thank you.
//In this code we compare the sorting times for lists and vectors.
//Both contain a sequence of structs
#include <iostream>
#include <vector>
#include <list>
#include <algorithm>
#include <time.h>
#include <math.h>
#include <stdlib.h>
#include <iomanip>
using namespace std;
struct particle
{
double x;
double y;
double z;
double w;
bool operator<(const particle& a) const
{
return x < a.x;
}
};
int main(int argc, char *argv[])
{
int N=20000000;
clock_t start,stop;
vector<particle> myvec(N);
vector<particle>::iterator cii;
//Set vector values
for (cii = myvec.begin(); cii != myvec.end(); ++cii)
{
cii->x =1.0*rand()/RAND_MAX;
cii->y =1.0*rand()/RAND_MAX;
cii->z =1.0*rand()/RAND_MAX;
cii->w =1.0*rand()/RAND_MAX;
}
list<particle> mylist(N);
list<particle>::iterator dii;
//Set list values
for (cii=myvec.begin(),dii = mylist.begin(); dii != mylist.end() && cii!=myvec.end(); ++dii, ++cii)
{
dii->x =cii->x;
dii->y =cii->y;
dii->z =cii->z;
dii->w =cii->w;
}
//Sort the vector
start=clock();
sort(myvec.begin(),myvec.end());
stop=clock();
cout<<"Time for sorting vector "<<(stop-start)/(double) CLOCKS_PER_SEC<<endl;
//Sort the list
start=clock();
mylist.sort();
stop=clock();
cout<<"Time for sorting list "<<(stop-start)/(double) CLOCKS_PER_SEC<<endl;
return 0;
}
std::vector is insanely faster than std::list to find an element. std::vector always performs faster than std::list with very small data. std::vector is always faster to push elements at the back than std::list.
Both should run in O(n log n), but in general sort is faster than stable_sort.
All data structures are impacted the same way when the data size increases, because there will be more memory to allocate. The vector_pre is clearly the winner of this test, being one order of magnitude faster than a list and about twice faster than a vector without pre-allocation.
In general, std::sort is indeed faster than qsort because of a couple of these things: qsort operates on void* , which first requires a dereference, and second requires the size of the data type to perform the swaps.
list::sort
and std::sort
on vectors don't use the same algorithm.
std::sort
uses a sorting algorithm that requires random-access iterators, such as the ones required by std::vector
, but not by std::list
.
list::sort
is specialized for lists; it usually implements a merge sort, which does not require random access.
The total number of comparisons is O(n log n) for both algorithms (I say that without knowing the exact algorithm used by my compiler's std::sort
implementation).
The total number of swaps is O(n log n) as well, but for std::sort
, that means O(n log n) calls to copy constructor/assignment operator, whereas for list::sort
, it's a pointer operation. Your structure is way too small for this advantage to pay off.
I assume that as soon as you put something with a non-trivial copy constructor into the struct (maybe a std::string
is enough), std::list
will win.
EDIT: One std::string member initialised with a random double converted to text seems to be about the break-even point on my machine (x86_64-linux, gcc 4.6.2)
A vector packs things closer in memory than a list does. This results in a more cache-friendly access pattern during sorting.
No a std::vector
is not sorted using merge sort (in most implementations; the standard doesn't specify the algorithm).
std::list
does not have O(1) random access, so it cannot use algorithms like Quick sort* which requires O(1) random access to be fast (this is also why std::sort
doesn't work on std::list
.)
With this, you'll have to use algorithms that forward iteration is enough, such as the Merge sort**.
And merge sort is typically slower [1][2].
See also: what's the difference between list.sort and std::sort?
*: libstdc++ actually uses introsort.
**: libstdc++ actually uses a variant of merge sort
I'm really not a C++ programmer, but my understanding is that std::vector
has different performance characteristics from std::list
. Specifically (as @Martinho commented):
std::vector
has O(1) random access, whilestd::list
doesn't.
From cplusplus.com (I'm sure there are less sketchy references out there, feel free to chime in):
Vectors are good at:
- Accessing individual elements by their position index (constant time).
- Iterating over the elements in any order (linear time).
- Add and remove elements from its end (constant amortized time).
and:
...Advantages to list containers:
- Efficient insertion and removal of elements anywhere in the container (constant time).
- Efficient moving elements and block of elements within the container or even between different containers (constant time).
- Iterating over the elements in forward or reverse order (linear time).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With