Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unable to reproduce: C++ Vector performance advantages over C# List performance

Tags:

At Microsoft's BUILD conference Herb Sutter explained that C++ has "Real Arrays" and C#/Java languages do not have the same or sort of.

I was sold on that. You can watch the full talk here http://channel9.msdn.com/Events/Build/2014/2-661

Here is a quick snapshot of the slide where he described this. http://i.stack.imgur.com/DQaiF.png

But I wanted to see how much difference will I make.

So I wrote very naive programs for testing, which create a large vector of strings from a file with lines ranging from 5 characters to 50 characters.

Link to the file:

www (dot) dropbox.com/s/evxn9iq3fu8qwss/result.txt

Then I accessed them in sequence.

I did this exercise in both C# and C++.

Note: I made some modifications, removed the copying in the loops as suggested. Thank you for helping me to understand the Real arrays.

In C# I used both List and ArrayList because ArrayList it is deprecated in favor of List.

Here are the results on my dell laptop with Core i7 processor:

count       C# (List<string>)   C# (ArrayList)     C++   

1000           24 ms              21 ms             7 ms       
10000         214 ms             213 ms            64 ms     
100000  2 sec 123 ms       2 sec 125 ms           678 ms    

C# code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Collections;
namespace CSConsole
{
    class Program
    {
        static void Main(string[] args)
        {
            int count;
            bool success = int.TryParse(args[0], out count);

            var watch = new Stopwatch();
            System.IO.StreamReader isrc = new System.IO.StreamReader("result.txt");

            ArrayList list = new ArrayList();
            while (!isrc.EndOfStream)
            {
                list.Add(isrc.ReadLine());
            }
            double k = 0;
            watch.Start();
            for (int i = 0; i < count; i++)
            {
                ArrayList temp = new ArrayList();
                for (int j = 0; j < list.Count; j++)
                {
                   // temp.Add(list[j]);
                    k++;
                }
            }

            watch.Stop();
            TimeSpan ts = watch.Elapsed;

            //Console.WriteLine(ts.ToString());
            Console.WriteLine("Hours: {0} Miniutes: {1} Seconds: {2} Milliseconds: {3}", ts.Hours, ts.Minutes, ts.Seconds, ts.Milliseconds);
            Console.WriteLine(k);
            isrc.Close();
        }


    }
}

C++ code

#include "stdafx.h"
#include <stdio.h>
#include <tchar.h>

#include <vector>
#include <fstream>
#include <chrono>
#include <iostream>
#include <string>

using namespace std;

std::chrono::high_resolution_clock::time_point time_now()
{
    return std::chrono::high_resolution_clock::now();
}

float time_elapsed(std::chrono::high_resolution_clock::time_point const & start)
{

    return std::chrono::duration_cast<std::chrono::milliseconds>(time_now() - start).count();
    //return std::chrono::duration_cast<std::chrono::duration<float>>(time_now() - start).count();
}


int _tmain(int argc, _TCHAR* argv [])
{
    int  count = _wtoi(argv[1]);

    vector<string> vs;
    fstream fs("result.txt", fstream::in);
    if (!fs) return -1;

    char* buffer = new char[1024];
    while (!fs.eof())
    {
        fs.getline(buffer, 1024);
        vs.push_back(string(buffer, fs.gcount()));
    }
    double k = 0;
    auto const start = time_now();
    for (int i = 0; i < count; i++)
    {
        vector<string> vs2;
        vector<string>::const_iterator iter;
        for (iter = vs.begin(); iter != vs.end(); iter++)
        {
            //vs2.push_back(*iter);
            k++;
        }
    }

    auto const elapsed = time_elapsed(start);
    cout << elapsed << endl;
    cout << k;
    fs.close();
    return 0;
}
like image 928
Gangadhar Chowdary Payyavula Avatar asked Apr 08 '14 01:04

Gangadhar Chowdary Payyavula


People also ask

Are C style arrays faster than vectors?

A std::vector can never be faster than an array, as it has (a pointer to the first element of) an array as one of its data members. But the difference in run-time speed is slim and absent in any non-trivial program. One reason for this myth to persist, are examples that compare raw arrays with mis-used std::vectors.

Is vector better than array C++?

Vector is better for frequent insertion and deletion, whereas Arrays are much better suited for frequent access of elements scenario. Vector occupies much more memory in exchange for managing storage and growing dynamically, whereas Arrays are a memory-efficient data structure.

Is vector faster than set C++?

The time complexity for the insertion of a new element is O(log N). Vector is faster for insertion and deletion of elements at the end of the container. Set is faster for insertion and deletion of elements at the middle of the container.

How much slower are vectors than arrays?

That's about 3 - 4 times slower!


2 Answers

The differences found by your sample program has nothing to do with lists or their structure.

It's because in C#, strings are a reference type, whereas in C++ code, you are using them as a value type.

For example:

string a = "foo bar baz";
string b = a;

Assigning b = a is just copying the pointer.

This follows through into lists. Adding a string to a C# list is just adding a pointer to the list. In your main loop, you create N lists, all of which just contain pointers to the same strings.

Because you're using strings by value in C++ however, it has to copy them each time.

vector<string> vs2;
vector<string>::const_iterator iter;
for (iter = vs.begin(); iter != vs.end(); iter++)
{
   vs2.push_back(*iter); // STRING IS COPIED HERE
}

This code is actually making copies of each string. You end up with copies of all the strings, and will use a lot more memory. This is slower for obvious reasons.

If you rewrite the loop as follows:

vector<string*> vs2;
for (auto& s : vs)
{
    vs2.push_back(&(s));
}

Then you're now creating lists-of-pointers not lists-of-copies and are on equal footing with C#.

On my system, the C# program runs with N of 1000 in about 138 milliseconds, and the C++ one runs in 44 milliseconds, a clear win to C++.


Commentary:

One of the main benefits of C++ vectors as per herb sutter's picture, is that the memory layout can be contiguous (i.e. all the items are stuck next to each other in memory). You'll never see this work with a std::string however, as strings require dynamic memory allocation (you can't stick a load of strings next to each other in an array because each string has a different length)

This would give a large benefit if you wanted to quickly iterate through them all, as it's much friendlier to CPU caches, but the tradeoff is that you have to copy all the items to get them into the list.

Here's an example which better illustrates it:

C# Code:

class ValueHolder {
    public int tag;
    public int blah;
    public int otherBlah;

    public ValueHolder(int t, int b, int o)
    { tag = t; blah = b; otherBlah = o; }
};

static ValueHolder findHolderWithTag(List<ValueHolder> buffer, int tag) {
    // find holder with tag i
    foreach (var holder in buffer) {
        if (holder.tag == tag)
            return holder;
    }
    return new ValueHolder(0, 0, 0);
}

static void Main(string[] args)
{
    const int MAX = 99999;

    int  count = 1000; // _wtoi(argv[1]);
    List<ValueHolder> vs = new List<ValueHolder>();
    for (int i = MAX; i >= 0; i--) {
        vs.Add(new ValueHolder(i, 0, 0)); // construct valueholder with tag i, blahs of 0
    }

    var watch = new Stopwatch();
    watch.Start();

    for (int i = 0; i < count; i++)
    {
        ValueHolder found = findHolderWithTag(vs, i);
        if (found.tag != i)
            throw new ArgumentException("not found");
    }

    watch.Stop();
    TimeSpan ts = watch.Elapsed;
    Console.WriteLine("Hours: {0} Miniutes: {1} Seconds: {2} Milliseconds: {3}", ts.Hours, ts.Minutes, ts.Seconds, ts.Milliseconds);
}

C++ Code:

class ValueHolder {
public:
    int tag;
    int blah;
    int otherBlah;

    ValueHolder(int t, int b, int o) : tag(t), blah(b), otherBlah(o) { }
};

const ValueHolder& findHolderWithTag(vector<ValueHolder>& buffer, int tag) {
    // find holder with tag i
    for (const auto& holder : buffer) {
        if (holder.tag == tag)
            return holder;
    }
    static ValueHolder empty{ 0, 0, 0 };
    return empty;
}

int _tmain(int argc, _TCHAR* argv[])
{
    const int MAX = 99999;

    int  count = 1000; // _wtoi(argv[1]);
    vector<ValueHolder> vs;
    for (int i = MAX; i >= 0; i--) {
        vs.emplace_back(i, 0, 0); // construct valueholder with tag i, blahs of 0
    }

    auto const start = time_now();
    for (int i = 0; i < count; i++)
    {
        const ValueHolder& found = findHolderWithTag(vs, i);
        if (found.tag != i) // need to use the return value or compiler will optimize away
            throw "not found";
    }

    auto const elapsed = time_elapsed(start);
    cout << elapsed << endl;
    return 0;
}

We already know from the original question that creating a bunch of duplicate lists would be much faster in C# than in C++, but what about searching the list instead?

Both programs are just doing a stupid linear list-scan in a simple attempt to show this.

On my PC, the C++ version runs in 72ms and the C# one takes 620ms. Why the speed increase? Because of the "real arrays".

All the C++ ValueHolders are stuck next to eachother in the std::vector. When the loop wants to read the next one, this means it's most likely already in the CPU cacue.

The C# ValueHolders are in all kinds of random memory locations, and the list just contains pointers to them. When the loop wants to read the next one, it is almost certainly not in the CPU cache, so we have to go and read it. Memory access is slow, hence the C# version takes nearly 10x as long.

If you change the C# ValueHolder from class to struct, then the C# List can stick them all next to eachother in memory, and the time drops to 161ms. Buuut now it has to make copies when you're inserting into the list.

The problem for C# is that there are many many situations where you can't or don't want to use a struct, whereas in C++ you have more control over this kind of thing.

PS: Java doesn't have structs, so you can't do this at all. You're stuck with the 10x as slow cache unfriendly version

like image 174
Orion Edwards Avatar answered Oct 06 '22 00:10

Orion Edwards


While C# string and C++ std::string share a name, and both are ordered collections of character types, they are very different otherwise.

std::string is a mutable container, C#'s string is immutable. This means the data in two C# strings can be shared: copying a C# string duplicates a pointer (aka reference), and does knock-on lifetime management work. Duplicating a std::string copies the contents (copy on write C++ std::string are not legal).

To create a more similar set of C++ code, store std::shared_ptr<const std::string> in your vectors. And when you stuff the duplicate vectors, make sure you only copy the smart pointers. Now you have (smart) pointers to immutable data, just like C#.

Note that this will improve copy performance probably in C++, but will make most other operations slower (if want to you edit the std::string you now have to copy the entire thing, then modify the copy, and read does an extra pointer dereference).

struct immutable_string {
  immutable_string() = default;
  immutable_string(const&) = default;
  immutable_string(&&) = default;
  immutable_string& operator=(const&) = default;
  immutable_string& operator=(&&) = default;

  immutable_string( std::string s ):data( std::make_shared<std::string>(std::move(s)) ) {}

  std::string const& get() const { if (data) return *data; return *empty_string(); }
  std::string get() && {
    if (!data) return {};
    if (data->use_count()==1) return std::move( const_cast<std::string&>(*data) );
    return *data;
  }
  template<class... Args>
  void emplace( Args&&... args ) {
    data = std::make_shared<std::string>(std::forward<Args>(args)...);
  }
  template<class F>
  void modify( F&& f ) {
    std::string tmp = std::move(*this).get();
    std::forward<F>(f)(tmp);
    emplace(std::move(tmp));
  }
private:
  std::shared_ptr<const std::string> data;
  static std::shared_ptr<const std::string> empty_string() {
    static auto nothing = std::make_shared<const std::string>();
    return nothing;
  }
};

this gives us COW. To edit, call immutable_string s; s.modify( [&](std::string& s){ code goes here });.

like image 41
Yakk - Adam Nevraumont Avatar answered Oct 05 '22 23:10

Yakk - Adam Nevraumont