
Do Static Variables Impede Data Caching?

From Optimizing Software in C++ (Section 7.1),

The advantage of static data is that it can be initialized to desired values before the program starts. The disadvantage is that the memory space is occupied throughout the whole program execution, even if the variable is only used in a small part of the program. This makes data caching less efficient.

The usage of static in this excerpt applies to both C and C++, specifically in the sense of static storage duration.

Can anyone shed some light on why (or whether) data caching is less efficient for static duration variables? Here is a specific comparison:

void foo() {
  static int static_arr[] = {/**/};
}
void bar() {
  int local_arr[] = {/**/};
}

I don't see any reason why static data would cache differently than any other kind of data. In the given example, I would think that foo will be faster because the execution stack doesn't have to load static_arr, whereas in bar, the execution stack has to load local_arr. In either case, if these functions were called repeatedly, both static_arr and local_arr will be cached. Am I wrong?

okovko asked Dec 11 '22 03:12


2 Answers

Agner Fog usually knows what he is talking about.

If we read the quote in the context of section 7.1 Different kinds of variable storage, we see what he means by "less efficient caching" in the beginning of the section:

Data caching is poor if data are scattered randomly around in the memory. It is therefore important to understand how variables are stored. The storage principles are the same for simple variables, arrays and objects.

So the idea behind saying that static variables are less cache-efficient is that the chance that the memory location where they are stored is "cold" (no longer in cache) is greater than with stack memory, which is where the variable with automatic storage duration would be stored.

With caching and paging in mind, it's the combination of spatial and temporal locality of data storage that affects performance.
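To make the "scattered" point concrete, here is a rough sketch that counts how many cache lines a group of variables covers. It is only a model, not a measurement: the 64-byte line size is an assumption (common on x86, not universal), and `lines_touched`, `layout` and `demo_locality` are made-up names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Assumed cache-line size in bytes; 64 is common on x86 but not guaranteed.
constexpr std::uintptr_t kAssumedLineSize = 64;

// Count the distinct cache lines covered by a list of object start addresses.
// Simplification: each object is assumed to fit inside a single line.
std::size_t lines_touched(const std::vector<std::uintptr_t>& addresses) {
    std::set<std::uintptr_t> lines;
    for (std::uintptr_t a : addresses) {
        lines.insert(a / kAssumedLineSize);
    }
    return lines.size();
}

// Hypothetical layout: `count` variables whose start addresses are
// `stride` bytes apart, beginning at `base`.
std::vector<std::uintptr_t> layout(std::uintptr_t base, std::size_t count,
                                   std::uintptr_t stride) {
    std::vector<std::uintptr_t> out;
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(base + i * stride);
    }
    return out;
}

// Usage: sixteen 4-byte variables packed together versus one every 4 KiB.
void demo_locality() {
    assert(lines_touched(layout(0x1000, 16, 4)) == 1);      // one shared line
    assert(lines_touched(layout(0x1000, 16, 4096)) == 16);  // one line each
}
```

Packed variables, as in a single stack frame, share a line in this model, while widely separated ones each need their own line to be brought into the cache.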

rustyx answered Dec 12 '22 16:12


The answer from rustyx explains it. Local variables are stored on the stack. The stack space is released when a function returns and reused when the next function is called. Caching is more efficient for local variables because the same memory space is reused again and again, while static variables are scattered around at different memory addresses that can never be reused for another purpose. Whether static data are stored in the DATA section (initialized) or the BSS section (uninitialized) makes no difference in this respect. The top-of-stack space is likely to stay cached throughout program execution and be reused many times.

Another advantage is that a limited number of local variables can be accessed with an 8-bit offset relative to the stack pointer, while static variables need a 32-bit absolute address (in 32-bit x86) or a 32-bit relative address (in x86-64). In other words, local variables may make the code more compact and improve utilization of the code cache as well as the data cache.
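The code-size argument can be sketched with a simplified model of x86 displacement encoding: an offset that fits in a signed 8-bit value costs one byte in the instruction, anything else costs four. This ignores encodings with no displacement at all, and `displacement_bytes`, `kStaticAddressBytes` and `demo_displacement` are illustrative names, not part of any real tool.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Simplified model: bytes needed to encode a stack-relative offset.
// Offsets in the signed 8-bit range [-128, 127] take 1 byte, others take 4.
constexpr int displacement_bytes(long offset) {
    return (offset >= std::numeric_limits<std::int8_t>::min() &&
            offset <= std::numeric_limits<std::int8_t>::max())
               ? 1
               : 4;
}

// Per the answer above: a static variable needs a 32-bit absolute address
// (32-bit x86) or a 32-bit relative address (x86-64).
constexpr int kStaticAddressBytes = 4;

// Usage: locals close to the stack pointer are cheaper to address.
void demo_displacement() {
    assert(displacement_bytes(-8) == 1);     // nearby local
    assert(displacement_bytes(127) == 1);    // last offset that still fits
    assert(displacement_bytes(128) == 4);    // just out of 8-bit range
    assert(displacement_bytes(-128) == 1);   // lowest offset that fits
    assert(displacement_bytes(-129) == 4);
    assert(displacement_bytes(16) < kStaticAddressBytes);
}
```

Under this model only the locals within about 128 bytes of the pointer get the short form, which is why the advantage applies to a limited number of local variables.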

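// Example
void f();
void g();

int main() {
  f();  // x lives on the stack only during this call
  g();  // g's frame can reuse the stack space that f released
  return 0;
}

void f() {
  int x;
  // ...
}

void g() {
  int y;  // y may occupy the same memory address as x
  // ...
}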
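
If you want to see this on your own machine, a small sketch along the following lines can help. The helper names are hypothetical, the addresses are converted to integers and only compared (never dereferenced), and whether the two locals really land on the same address depends on the compiler, optimization level and inlining, so that part is reported rather than asserted.

```cpp
#include <cassert>
#include <cstdint>

// Each helper returns the address of one of its own variables as an integer.
std::uintptr_t local_address_in_f() {
    int x = 1;
    std::uintptr_t a = reinterpret_cast<std::uintptr_t>(&x);
    return a;
}

std::uintptr_t local_address_in_g() {
    int y = 2;
    std::uintptr_t a = reinterpret_cast<std::uintptr_t>(&y);
    return a;
}

std::uintptr_t static_address() {
    static int s = 3;
    std::uintptr_t a = reinterpret_cast<std::uintptr_t>(&s);
    return a;
}

// True if f's and g's locals happened to occupy the same stack slot.
// Not guaranteed either way; it depends on how the compiler lays out frames.
bool locals_shared_a_slot() {
    return local_address_in_f() == local_address_in_g();
}

void demo_storage() {
    // A static object keeps one address for the whole program run.
    assert(static_address() == static_address());
    // A live local and a static are distinct objects, so their addresses differ.
    assert(local_address_in_f() != static_address());
}
```

Calling `locals_shared_a_slot()` from `main` and printing the result shows whether the stack slot was reused in a given build; the static's address stays fixed and is never handed to another variable.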
A Fog answered Dec 12 '22 15:12