I have two versions of searching through an int array for a specific value.
The first version is the straight forward one
int FindWithoutBlock(int * Arr, int ArrLen, int Val)
{
for ( int i = 0; i < ArrLen; i++ )
if ( Arr[i] == Val )
return i;
return ArrLen;
}
The second version should be faster. The passed array needs to be one element larger than in the previous case. Say for an array with 5 values, you allocate six ints and then do the following
int FindWithBlock(int * Arr, int LastCellIndex, int Val)
{
Arr[LastCellIndex] = Val;
int i;
for ( i = 0 ; Arr[i] != Val; i++ );
return i;
}
this version should be faster - you don't need to check array boundaries with each iteration through Arr.
Now the "issue". When running these functions 100K times on an array of 100K elements in Debug, the second version is roughly 2x faster. In Release however, the first version is approximately 6000x faster. And the question is why.
A program that demonstrates this is to be found at http://eubie.sweb.cz/main.cpp
Any insight is much appreciated. Daniel
An array is faster than a list in python since all the elements stored in an array are homogeneous i.e., they have the same data type whereas a list contains heterogeneous elements. Moreover, Python arrays are implemented in C which makes it a lot faster than lists that are built-in in Python itself.
Short answer: In . NET List<T> and Array<T> have the same speed/performance because in . NET List is wrapper around Array .
Better use of Memory: From a memory allocation point of view, linked lists are more efficient than arrays. Unlike arrays, the size for a linked list is not pre-defined, allowing the linked list to increase or decrease in size as the program runs.
Arrays can store data very compactly and are more efficient for storing large amounts of data. Arrays are great for numerical operations; lists cannot directly handle math operations. For example, you can divide each element of an array by the same number with just one line of code.
Here are my results using DevStudio 2005:
Debug:
Release:
It is very important to run this from the command line and not from within DevStudio, DevStudio does something to affect the performance of the app.
The only way to know what's really happening is to look at the assembler code. Here's the assembler generated in release:-
FindWithoutBlock:
00401000 xor eax,eax
00401002 cmp dword ptr [ecx+eax*4],0F4240h
00401009 je FindWithoutBlock+1Ah (40101Ah)
0040100B add eax,1
0040100E cmp eax,186A0h
00401013 jl FindWithoutBlock+2 (401002h)
00401015 mov eax,186A0h
0040101A ret
Note that the compiler has removed the ArrLen parameter and replaced it with a constant! It has also kept it as a function.
Here's what the compiler did with the other function (FindWithBlock):-
004010E0 mov dword ptr [esp+38h],186A0h
004010E8 mov ebx,0F4240h
004010ED mov dword ptr [esi+61A80h],ebx
004010F3 xor eax,eax
004010F5 cmp dword ptr [esi],ebx
004010F7 je main+0EFh (40110Fh)
004010F9 lea esp,[esp]
00401100 add eax,1
00401103 cmp dword ptr [esi+eax*4],ebx
00401106 jne main+0E0h (401100h)
00401108 cmp eax,186A0h
0040110D je main+0F5h (401115h)
0040110F call dword ptr [__imp__getchar (4020D0h)]
00401115 sub dword ptr [esp+38h],1
0040111A jne main+0CDh (4010EDh)
Here, the function has been in-lined. The lea esp,[esp]
is just a 7 byte nop to align the next instruction. The code checks index 0 separately to all the other indices but the main loop is definately tighter than the FindWithoutBlock version.
Hmmm. Here's the code that calls FindWithoutBlock:-
0040106F mov ecx,edi
00401071 mov ebx,eax
00401073 call FindWithoutBlock (401000h)
00401078 mov ebp,eax
0040107A mov edi,186A0h
0040107F cmp ebp,186A0h
00401085 je main+6Dh (40108Dh)
00401087 call dword ptr [__imp__getchar (4020D0h)]
0040108D sub edi,1
00401090 jne main+5Fh (40107Fh)
Aha! The FindWitoutBlock function is only being called once! The compiler has spotted that the function will return the same value every time and has optimised it to a single call. In the FindWithBlock, the compiler can't make the same assumption because you write to the array before the search, thus the array is (potentially) different for each call.
To test this, add the volatile
keyword like this:-
int FindWithoutBlock(volatile int * Arr, int ArrLen, int Val)
{
for ( int i = 0; i < ArrLen; i++ )
if ( Arr[i] == Val )
return i;
return ArrLen;
}
int FindWithBlock(volatile int * Arr, int LastCellIndex, int Val)
{
Arr[LastCellIndex] = Val;
int i;
for ( i = 0 ; Arr[i] != Val; i++ );
return i;
}
Doing this, both versions run in similar time (6.040). Seeing as the memory access is a major bottleneck, the more complex tests of the FindWithoutBlock don't impact on the overall speed.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With