What should I take into consideration when developing a game in terms of fast memory access in C++?
The data I load is static, so I should put it in a contiguous block of memory, right?
Also, how should I organize the variables inside structs to improve performance?
Registers are the fastest storage to access, but the old C register keyword is only a hint; modern compilers make register-allocation decisions themselves (and C++17 removed the keyword's meaning entirely), so this is not something you can usefully control.
The stack is faster than the heap because its access pattern makes allocation and deallocation trivial (a pointer/integer is simply incremented or decremented), while the heap involves much more complex bookkeeping for every allocation and free.
When a variable is created in C or C++, it is assigned a memory address: the location where its value is stored. Heap memory is slower to access than stack memory, it is much larger, and because heap data is visible to all threads it is not inherently thread-private the way each thread's own stack is.
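To tie this back to the question: yes, loading static data into one contiguous block (for example a single std::vector) is a good default, because a single up-front allocation that is then traversed linearly is friendly to the cache and the prefetcher. The Tile and Level types below are purely hypothetical, a minimal sketch rather than a recommendation for any particular engine:

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Hypothetical static level data kept in one contiguous heap block.
    struct Tile {
        std::uint16_t type;
        std::uint16_t flags;
    };

    struct Level {
        std::vector<Tile> tiles;  // all tiles live in one contiguous allocation

        explicit Level(std::size_t count) : tiles(count) {}
    };

Iterating Level::tiles walks memory linearly, whereas allocating each tile separately with new would scatter them across the heap.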
"Memory performance" on its own is extremely vague.
I think what you are actually looking for is how to handle the CPU cache, since there is a factor of roughly 10 between an access served from the cache and an access that has to go to main memory.
For a complete reference on the mechanisms behind the cache, you might wish to read Ulrich Drepper's excellent series of articles, "What every programmer should know about memory", on lwn.net.
In short:
Aim at Locality
You should not jump around in memory, so try (when possible) to group together items that will be used together.
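For example (a hedged sketch; the Particles type and its field names are invented here): if a per-frame update only ever touches positions and velocities, keeping those in their own contiguous arrays means the hot loop streams through exactly the data it needs:

    #include <cstddef>
    #include <vector>

    struct Particles {
        std::vector<float> x, y;    // hot: read and written every frame
        std::vector<float> vx, vy;  // hot: read every frame
        std::vector<int>   debugId; // cold: only touched occasionally
    };

    void update(Particles& p, float dt) {
        // The loop walks four tightly packed arrays; the cold data
        // never pollutes the cache during the update.
        for (std::size_t i = 0; i < p.x.size(); ++i) {
            p.x[i] += p.vx[i] * dt;
            p.y[i] += p.vy[i] * dt;
        }
    }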
Aim at Predictability
If your memory accesses are predictable, the CPU will likely prefetch the memory for the next chunk of work, so that it is available immediately, or shortly, after finishing the current chunk.
The typical example is with for loops on arrays:
// array is declared as int array[MAX][MAX]; C++ lays it out row by row
for (int i = 0; i != MAX; ++i)
    for (int j = 0; j != MAX; ++j)
        array[i][j] += 1;
Change array[i][j] += 1; to array[j][i] += 1; and each access jumps MAX ints ahead in memory instead of walking it linearly, and the performance varies accordingly... at low optimization levels, at least ;)
The compiler should catch those obvious cases, but some are more insidious. For example, using node-based containers (linked lists, binary search trees) instead of array-based containers (vector, some hash tables) may slow down the application, as sketched below.
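As an illustration (this snippet is mine, not part of the original answer): summing the same numbers from a std::vector and from a std::list does the same logical work, but the vector's elements are contiguous while the list's nodes may be scattered across the heap, so the list traversal tends to be dominated by cache misses:

    #include <list>
    #include <numeric>
    #include <vector>

    // Same algorithm, different memory layout underneath.
    long long sumVector(const std::vector<int>& v) {
        return std::accumulate(v.begin(), v.end(), 0LL);  // linear, prefetch-friendly
    }

    long long sumList(const std::list<int>& l) {
        return std::accumulate(l.begin(), l.end(), 0LL);  // pointer chasing, node by node
    }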
Don't waste space... beware of false sharing
Try to pack your structures. Because of alignment requirements, padding can creep in between members, artificially inflating the structure size and wasting cache space.
A typical rule of thumb is to order the members of the structure by decreasing size (check with sizeof). It is a dumb rule, but it works well. If you know the sizes and alignments involved, just avoid the holes yourself :) Note: this is only worthwhile for structures that exist in lots of instances...
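A quick illustration of that rule of thumb (exact sizes depend on the platform ABI; the numbers in the comments assume a typical 64-bit target):

    #include <cstdint>

    struct Padded {            // typically sizeof == 24
        std::uint8_t  a;       // 1 byte, then 7 bytes of padding before b
        double        b;       // 8 bytes, needs 8-byte alignment
        std::uint16_t c;       // 2 bytes, then 6 bytes of tail padding
    };

    struct Reordered {         // typically sizeof == 16
        double        b;       // largest member first
        std::uint16_t c;
        std::uint8_t  a;       // small members fill what used to be padding
    };

    static_assert(sizeof(Reordered) <= sizeof(Padded),
                  "decreasing-size order wastes no extra space here");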
However, beware of false sharing. In multi-threaded programs, concurrent access to two variables that are close enough to share the same cache line is costly, because it causes a lot of cache-line invalidation as the cores battle for ownership of the line.
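A common mitigation, sketched below with a hard-coded 64-byte line size (an assumption; C++17 exposes std::hardware_destructive_interference_size in <new> as a portable hint), is to align per-thread data so that each piece gets its own cache line:

    #include <atomic>

    struct Counters {
        // Each counter sits on its own 64-byte cache line, so two threads
        // incrementing "their own" counter no longer invalidate each other.
        alignas(64) std::atomic<long> countA{0};
        alignas(64) std::atomic<long> countB{0};
    };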
Profile
Unfortunately, this is HARD to figure out.
If you happen to be programming on Unix, Callgrind (part of the Valgrind suite) can be run with cache simulation enabled to identify the parts of the code triggering cache misses.
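For example (treat this as a sketch; exact flags and output names can vary with your Valgrind version):

    valgrind --tool=callgrind --cache-sim=yes ./your_game
    callgrind_annotate callgrind.out.<pid>

The first command records simulated cache hits and misses per function; the second prints an annotated report.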
I guess there are other tools; I have just never used them.