I am planning to join the development of a program written in C for Monte Carlo analysis of complex problems. The code allocates huge data arrays in memory to speed up its performance; for that reason the author chose C over C++, claiming that C makes for faster code that is more reliable with respect to memory leaks.
Do you agree with that? What would your choice be if you needed to store 4-16 GB of data arrays in memory during a calculation?
Definitely C++. By default there's no significant difference between the two, but C++ provides a couple of things that C doesn't.
The bottom line is that in this respect, C provides absolutely no possibility of an advantage over C++. In the very worst case, you can do exactly the same things in the same ways.
There is one feature of C99 that's absent from C++ and that potentially gives significant speed gains in heavy number-crunching code, and that is the keyword restrict. If you can use a C++ compiler that supports it, then you have an extra tool in the kit when it comes to optimizing. It's only a potential gain, though: sufficient inlining can allow the same optimizations as restrict, and more. It also has nothing to do with memory allocation.
If the author of the code can demonstrate a performance difference between C and C++ code allocating a 4-16GB array, then (a) I'm surprised, but OK, there's a difference, and (b) how many times is the program going to allocate such large arrays? Is your program actually going to spend a significant amount of its time allocating memory, or is it spending most of its time accessing memory and doing computations? It takes a long time to actually do anything with a 4GB array, compared with the time it took to allocate, and that means you should be worried about the performance of "anything", not the performance of allocation. Sprinters care a lot how quickly they get off the blocks. Marathon runners, not so much.
You also have to be careful how you benchmark. You should be comparing, for example, malloc(size) against new char[size]. If you test malloc(size) against new char[size](), then it's an unfair comparison, since the latter sets the memory to 0 and the former doesn't. Compare against calloc instead, but also note that malloc and calloc are both available from C++ in the (unlikely) event that they do prove measurably faster.
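To make the fairness point concrete, here is a minimal benchmarking sketch (the 4 GB size is a placeholder, and the sketch deliberately does not touch the memory afterwards; on many OSes zeroed pages are handed out lazily, so allocation time alone can understate the real cost):

    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <cstdlib>

    // Time one allocate-and-release round trip in milliseconds.
    template <class F>
    static long long time_ms(F f)
    {
        auto t0 = std::chrono::steady_clock::now();
        f();
        auto t1 = std::chrono::steady_clock::now();
        return std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
    }

    int main()
    {
        const std::size_t n = std::size_t(4) << 30;  // 4 GB

        // Fair pair 1: both leave the memory uninitialized.
        std::printf("malloc        : %lld ms\n",
                    time_ms([n] { std::free(std::malloc(n)); }));
        std::printf("new char[n]   : %lld ms\n",
                    time_ms([n] { delete[] new char[n]; }));

        // Fair pair 2: both zero the memory.
        std::printf("calloc        : %lld ms\n",
                    time_ms([n] { std::free(std::calloc(n, 1)); }));
        std::printf("new char[n]() : %lld ms\n",
                    time_ms([n] { delete[] new char[n](); }));
    }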
Ultimately, though, if the author "owns" or started the project, and prefers to write in C rather than C++, then he shouldn't justify that decision with probably-spurious performance claims, he should justify it by saying "I prefer C, and that's what I'm using". Usually when someone makes a claim like this about language performance, and it turns out on testing not to be true, you discover that performance is not the real reason for the language preference. Proving the claim false will not actually cause the author of this project to suddenly start liking C++.
There is no real difference between C and C++ in terms of memory allocation. C++ has more 'hidden' data, such as virtual table pointers, but only if you choose to have virtual methods on your objects. Allocating an array of chars is just as expensive in C as in C++; in fact, both probably use malloc underneath. In terms of performance, C++ calls a constructor for each object in the array, but only if one exists; a trivial default constructor does nothing and is optimized away.
As long as you're preallocating pools of data to avoid memory fragmentation, you should be good to go. If you have simple POD structs without virtual methods and without constructors, there's no difference.
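To illustrate (Sample here is a hypothetical record type, not anything from the project in question):

    #include <cstddef>
    #include <vector>

    // A POD record: no user-defined constructor, no virtual methods,
    // hence no hidden per-object data and no per-element constructor call.
    struct Sample {
        double energy;
        double weight;
        int    cell;
    };

    int main()
    {
        const std::size_t n = 1 << 20;  // placeholder size

        // Default-initialization: a POD array is left uninitialized,
        // the same cost as malloc(n * sizeof(Sample)).
        Sample* raw = new Sample[n];
        delete[] raw;

        // std::vector value-initializes (zeroes) its elements, so this
        // is the analogue of calloc, not of malloc.
        std::vector<Sample> v(n);
        (void)v;
    }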
The only thing counting against C++ is its additional complexity: combine that with a programmer who uses it incorrectly, and you can easily slow things down noticeably. Using a C++ compiler without C++ features will give you the same performance as C. Using C++ correctly, you have some possibilities to be faster.
The language isn't your problem, allocating and traversing large arrays is.
The main deadly mistake you could make in allocation (in either language) is to allocate 16 GB of memory and initialize it to zero, only to overwrite it with actual values later: every page gets written twice.
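A sketch of the difference (compute_values is a hypothetical stand-in for the real calculation):

    #include <cstddef>

    // Hypothetical placeholder for the actual computation.
    static void compute_values(double* data, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i)
            data[i] = static_cast<double>(i);
    }

    int main()
    {
        const std::size_t n = std::size_t(2) << 30;  // 2^31 doubles = 16 GB

        // Wasteful: every page is written twice, once with zeros and
        // once with the real values.
        double* a = new double[n]();  // value-initialized to 0.0
        compute_values(a, n);
        delete[] a;

        // Better: leave the memory uninitialized and write it once.
        double* b = new double[n];    // default-initialized (indeterminate)
        compute_values(b, n);
        delete[] b;
    }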
I'd expect the biggest performance gains to come from algorithmic optimizations that improve locality of reference.
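The classic example: both functions below sum the same row-major matrix, but the first walks memory contiguously while the second strides by a whole row per step and defeats the cache:

    #include <cstddef>

    // Cache-friendly: consecutive addresses on the inner loop.
    double sum_row_major(const double* m, std::size_t rows, std::size_t cols)
    {
        double s = 0.0;
        for (std::size_t r = 0; r < rows; ++r)
            for (std::size_t c = 0; c < cols; ++c)
                s += m[r * cols + c];
        return s;
    }

    // Cache-hostile: the inner loop jumps cols * sizeof(double) bytes
    // per iteration, touching a new cache line (or page) each time.
    double sum_column_major(const double* m, std::size_t rows, std::size_t cols)
    {
        double s = 0.0;
        for (std::size_t c = 0; c < cols; ++c)
            for (std::size_t r = 0; r < rows; ++r)
                s += m[r * cols + c];
        return s;
    }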
Depending on the underlying OS, you may also be able to influence its caching behavior, e.g. by indicating that a range of memory will be processed only sequentially.
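On POSIX systems, for instance, posix_madvise can pass such a hint; here is a minimal Linux-flavored sketch (MAP_ANONYMOUS is not available everywhere, and the advice is exactly that: advice the kernel is free to ignore):

    #include <sys/mman.h>   // POSIX-specific; no direct Windows equivalent

    #include <cstddef>
    #include <cstdio>

    int main()
    {
        const std::size_t n = std::size_t(4) << 30;  // 4 GB

        // Map anonymous memory directly so the pointer is page-aligned,
        // as the madvise family requires.
        void* p = mmap(nullptr, n, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { std::perror("mmap"); return 1; }

        // Tell the kernel we will touch this range sequentially, so it
        // may read ahead aggressively and evict pages behind us.
        posix_madvise(p, n, POSIX_MADV_SEQUENTIAL);

        // ... sequential processing of the array would go here ...

        munmap(p, n);
    }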