
Efficiency: arrays vs pointers

Memory access through pointers is said to be more efficient than memory access through an array. I am learning C and the above is stated in K&R. Specifically they say

Any operation that can be achieved by array subscripting can also be done with pointers. The pointer version will in general be faster

I disassembled the following code using Visual C++. (Mine is a 686 processor, and I have disabled all optimizations.)

    int a[10], *p = a, temp;

    void foo()
    {
        temp = a[0];
        temp = *p;
    }

To my surprise, memory access through a pointer takes three instructions, compared with the two taken by memory access through an array. Below is the corresponding code.

    ; 5    : temp = a[0];

        mov eax, DWORD PTR _a
        mov DWORD PTR _temp, eax

    ; 6    : temp = *p;

        mov eax, DWORD PTR _p
        mov ecx, DWORD PTR [eax]
        mov DWORD PTR _temp, ecx

Please help me understand. What am I missing here??


As pointed out by many answers and comments, I had used a compile-time constant as the array index, which arguably made the array access easier. Below is the assembly code with a variable as the index. I now have an equal number of instructions for access through the pointer and through the array. My broader question still holds: memory access through a pointer does not appear to be any more efficient.

    ; 7    :        temp = a[i];

        mov eax, DWORD PTR _i
        mov ecx, DWORD PTR _a[eax*4]
        mov DWORD PTR _temp, ecx

    ; 8    :
    ; 9    :
    ; 10   :        temp = *p;

        mov eax, DWORD PTR _p
        mov ecx, DWORD PTR [eax]
        mov DWORD PTR _temp, ecx
Abhijith Madhav asked Feb 21 '10 11:02



2 Answers

Memory access through pointers is said to be more efficient than memory access through an array.

That may have been true in the past when compilers were relatively stupid beasts. You only need to look at some of the code output by gcc in high optimisation modes to know that it is no longer true. Some of that code is very hard to understand but, once you do, its brilliance is evident.

A decent compiler will generate the same code for pointer accesses and array accesses and you should probably not be worrying about that level of performance. The people that write compilers know far more about their target architectures than we mere mortals. Concentrate more on the macro level when optimising your code (algorithm selection and so on) and trust in your tool-makers to do their job.
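One way to check this claim on your own toolchain is to write the same loop in both styles and compare the generated assembly. The following is only a minimal sketch: `sum_index` and `sum_pointer` are hypothetical helper names, not anything from the question, and whether the two listings come out identical depends on your compiler, target and optimisation level.

```c
#include <stddef.h>

/* Hypothetical helper: sums n ints using array subscripting. */
int sum_index(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Hypothetical helper: the same sum, walking a pointer over the elements. */
int sum_pointer(const int *a, size_t n)
{
    int total = 0;
    for (const int *p = a; p < a + n; p++)
        total += *p;
    return total;
}
```

With gcc, compiling the file with `-O2 -S` writes the assembly to a `.s` file, so the two function bodies can be compared side by side.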


In fact, I'm surprised the compiler didn't optimise the entire

temp = a[0]; 

line out of existence since temp is over-written in the very next line with a different value and a is in no way marked volatile.
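To illustrate the point about volatile, here is a small sketch that contrasts a plain store with a volatile one. The names `temp_plain`, `temp_volatile`, `foo_plain` and `foo_volatile` are made up for this example, and the initial value 7 is arbitrary. An optimising compiler is permitted (not required) to drop the first store in `foo_plain`, whereas both volatile writes in `foo_volatile` have to be performed.

```c
/* Illustrative globals; a and p mirror the question, the rest are assumed names. */
int a[10] = {7};
int *p = a;

int temp_plain;
volatile int temp_volatile;

void foo_plain(void)
{
    temp_plain = a[0];    /* dead store: may be removed when optimising */
    temp_plain = *p;
}

void foo_volatile(void)
{
    temp_volatile = a[0]; /* volatile write: must be performed */
    temp_volatile = *p;   /* volatile write: must be performed */
}
```

Either way the final value is the same; the difference only shows up in the generated instructions.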

I remember an urban myth from long ago about a benchmark for the latest VAX Fortran compiler (showing my age here) that outperformed its competitors by several orders of magnitude.

Turns out the compiler figured out that the result from the benchmark calculation wasn't used anywhere so it optimised the entire calculation loop into oblivion. Hence the substantial improvement in run speed.


Update: The reason the array access takes fewer instructions in your particular case is the way the location is found. a will be at a fixed location decided at link/load time and the reference to it will be fixed up at the same time. So a[0] or indeed a[any constant] will be at a fixed location.

And p itself will also be at a fixed location for the same reason. But the address stored in p is variable, so *p involves an extra lookup: the value of p has to be loaded first to find the correct memory location.

You'll probably find that having yet another variable x set to 0 (not const) and using a[x] would also introduce extra calculations.
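The extra calculation for a variable index is the address arithmetic that subscripting implies: the element lives at the array base plus the index times the element size. A small sketch of that equivalence follows; `addr_by_subscript` and `addr_by_arithmetic` are hypothetical names used only for illustration.

```c
#include <stddef.h>

int a[10];

/* Address of a[x] as the compiler sees the subscript expression. */
const char *addr_by_subscript(size_t x)
{
    return (const char *)&a[x];
}

/* The same address spelled out by hand: base + x * element size. */
const char *addr_by_arithmetic(size_t x)
{
    return (const char *)a + x * sizeof a[0];
}
```

When x is a compile-time constant the whole sum can be folded into a fixed address, which is why a[0] needed no run-time arithmetic in the question's first listing.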


In one of your comments, you state:

Doing as you suggested resulted in 3 instructions for memory access through arrays too (fetch index, fetch value of array element, store in temp). But I am still unable to see the efficiency. :-(

My response to that is that you very likely won't see an efficiency gain from using pointers. Modern compilers are more than up to the task of figuring out that array operations and pointer operations can be turned into the same underlying machine code.

In fact, without optimisation turned on, pointer code can be less efficient. Consider the following translations:

    int *pa, i, a[10];

    for (i = 0; i < 10; i++)
        a[i] = 100;
    /*
        movl    $0, -16(%ebp)              ; this is i, init to 0
    L2:
        cmpl    $9, -16(%ebp)              ; from 0 to 9
        jg      L3
        movl    -16(%ebp), %eax            ; load i into register
        movl    $100, -72(%ebp,%eax,4)     ; store 100 based on array/i
        leal    -16(%ebp), %eax            ; get address of i
        incl    (%eax)                     ; increment
        jmp     L2                         ; and loop
    L3:
    */

    for (pa = a; pa < a + 10; pa++)
        *pa = 100;
    /*
        leal    -72(%ebp), %eax
        movl    %eax, -12(%ebp)            ; this is pa, init to &a[0]
    L5:
        leal    -72(%ebp), %eax
        addl    $40, %eax
        cmpl    -12(%ebp), %eax            ; is pa at &(a[10])
        jbe     L6                         ; yes, stop
        movl    -12(%ebp), %eax            ; get pa
        movl    $100, (%eax)               ; store 100
        leal    -12(%ebp), %eax            ; get pa
        addl    $4, (%eax)                 ; add 4 (sizeof int)
        jmp     L5                         ; loop around
    L6:
    */

From that example, you can actually see that the pointer example is longer, and unnecessarily so. It loads pa into %eax multiple times without it changing and indeed alternates %eax between pa and &(a[10]). The default optimisation here is basically none at all.

When you switch up to optimisation level 2, the code you get is:

        xorl    %eax, %eax
    L5:
        movl    $100, %edx
        movl    %edx, -56(%ebp,%eax,4)
        incl    %eax
        cmpl    $9, %eax
        jle     L5

for the array version, and:

        leal    -56(%ebp), %eax
        leal    -16(%ebp), %edx
        jmp     L14
    L16:
        movl    $100, (%eax)
        addl    $4, %eax
    L14:
        cmpl    %eax, %edx
        ja      L16

for the pointer version.

I'm not going to do an analysis on clock cycles here (since it's too much work and I'm basically lazy) but I will point out one thing. There's not a huge difference in the code for both versions in terms of assembler instructions and, given the speeds that modern CPUs actually run at, you won't notice a difference unless you're doing billions of these operations. I always tend to prefer writing code for readability and only worrying about performance if it becomes an issue.

As an aside, that statement you reference:

5.3 Pointers and Arrays: The pointer version will in general be faster but, at least to the uninitiated, somewhat harder to grasp immediately.

dates back to the earliest versions of K&R, including my ancient 1978 one where functions are still written:

    getint(pn)
    int *pn;
    {
        ...
    }

Compilers have come an awfully long way since back then.

paxdiablo answered Sep 19 '22 18:09


If you're programming embedded platforms, you quickly learn that the pointer method is a lot faster than using an index.

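    #include <stdio.h>

    /* Assumed layout: the original post did not define struct bar. */
    struct bar { int value; };

    struct bar a[10], *p;

    void foo()
    {
        int i;

        // slow loop
        for (i = 0; i < 10; ++i)
            printf("%d\n", a[i].value);

        // faster loop
        for (p = a; p < &a[10]; ++p)
            printf("%d\n", p->value);
    }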

The slow loop has to calculate a + (i * sizeof(struct bar)) each time through, whereas the second just has to add sizeof(struct bar) to p each time through. The multiply operation uses more clock cycles than the add on many processors.

You really start to see improvements if you reference a[i] multiple times inside the loop. Some compilers don't cache that address, so it may be recalculated multiple times inside the loop.

Try updating your sample to use a struct and reference multiple elements.
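A sample along those lines might look like the sketch below. The fields of `struct bar` and the function names `total_index` and `total_pointer` are placeholders invented for illustration, since the original does not define them. Each loop touches three members per element, which is the situation where a compiler that does not cache the element address would recompute it.

```c
#include <stddef.h>

/* Hypothetical element type; the fields are placeholders. */
struct bar {
    int id;
    int value;
    int weight;
};

/* Index version: each a[i].field nominally needs a + i * sizeof(struct bar). */
long total_index(const struct bar *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += (long)a[i].value * a[i].weight + a[i].id;
    return total;
}

/* Pointer version: p advances by one element per iteration. */
long total_pointer(const struct bar *a, size_t n)
{
    long total = 0;
    for (const struct bar *p = a; p < a + n; p++)
        total += (long)p->value * p->weight + p->id;
    return total;
}
```

Disassembling both functions with and without optimisation on your own target should show whether the index version really pays for a multiply on each access or whether the compiler reduces it to the same pointer increment.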

tomlogic answered Sep 20 '22 18:09