
C - cache lines and association

Context

I have been reading papers about cache optimizations (for example, how accesses in a loop map onto cache lines).

The question concerns an array of 1024 integers.

Sizes: CPU cache 64 KB, cache line 32 bytes, integer size 4 bytes.

CPU: Intel Core 2 Duo.

Question

With these sizes, 8 integers fit in one cache line.

[0,1,2,3,4,5,6,7,8,9,10,...,1023]
         ^
If I access arr[4] and then walk downward, arr[3], arr[2], arr[1] and arr[0] will already be loaded. arr[5], arr[6] and arr[7] are loaded for nothing.

[0,1,2,3,4,5,6,7,8,..,1023]
               ^
If I access arr[7] and then walk downward, all the following elements will already be in the cache. If I walk upward instead, the CPU will have to load another cache line immediately after the arr[7] read.

Am I correct?

Going further

But what tells me that arr[4] is not at an address that will cause a cache line load instead of arr[7]? If my statement is true, we should consider not only the alignment within the array but the memory alignment of the whole program to minimize cache waste, right?

Larry asked Nov 13 '14 15:11



2 Answers

But what tells me that arr[4] is not at an address that will cause a cache line load instead of arr[7]?

int arrays are usually only guaranteed to be aligned on 4-byte boundaries (assuming int is 32 bits and a byte is 8 bits), so you can't know in advance where a cache line boundary will fall inside the array.

The lesson to learn is that you shouldn't worry about the occasional wasted cache line (that is, using 2 cache lines even though the data you need is less than 32 bytes), because that is mostly out of your hands when coding in C.

What you could worry about, if you are having performance problems, is choosing algorithms that reduce cache misses.

The typical example is loops:

int array[N][M];  // Assume N * M * sizeof (int) is much larger than the cache.
int i, j;

// Example 1: the inner loop walks along a row (consecutive addresses)
for (i = 0; i < N; i++) {
  for (j = 0; j < M; j++) {
    /* do something with array[i][j] */
  }
}

// Example 2: the inner loop walks down a column (stride of M ints)
for (j = 0; j < M; j++) {
  for (i = 0; i < N; i++) {
    /* do something with array[i][j] */
  }
}

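Example 2 will give around 8 times as many cache misses as Example 1 because it accesses the elements in the wrong order. C stores a 2D array row by row, so Example 1 uses all 8 ints of each 32-byte cache line before moving on (about one miss per 8 accesses), while Example 2 jumps M ints between consecutive accesses and, for an array this large, misses on almost every one.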

Klas Lindbäck answered Sep 29 '22 14:09


As far as your main question is concerned, yes, you are correct in both cases.

In the second case, where arr[7] has been read and you then continue upward, keep in mind that either the compiler or some hardware prefetching mechanism will probably take the spatial locality of this kind of access into account, thus improving performance.

Going further, reading some other address in the array could indeed cause a cache line load instead of arr[7] if the array isn't aligned to a cache line boundary in memory, but in this case alignment is not up to you but up to the compiler.

chrk answered Sep 29 '22 13:09