Array vs Slice: accessing speed

This question is about the speed of accessing elements of arrays and slices, not about the efficiency of passing them to functions as arguments.

I would expect arrays to be faster than slices in most cases because a slice is a data structure describing a contiguous section of an array and so there may be an extra step involved when accessing elements of a slice (indirectly the elements of its underlying array).
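To make the "slice describes a section of an array" idea concrete, here is a minimal sketch. The function name sharedWrite is hypothetical and only serves the illustration. It shows that a write through a slice lands in the underlying array, and that the slice carries its own length and capacity:

```go
package main

import "fmt"

// sharedWrite (hypothetical helper) slices an array and writes through the
// slice. It returns the array element that was written, plus the slice's
// length and capacity.
func sharedWrite() (elem, length, capacity int) {
	a := [5]int{0, 1, 2, 3, 4} // backing array
	s := a[1:4]                // slice describing a[1], a[2], a[3]
	s[0] = 42                  // writes to a[1] through the slice
	return a[1], len(s), cap(s)
}

func main() {
	elem, length, capacity := sharedWrite()
	fmt.Println(elem, length, capacity) // prints: 42 3 4
}
```

The capacity is 4 because the slice starts at index 1 of a 5-element array and may grow up to the end of that array.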

So I wrote a little test to benchmark a batch of simple operations. There are 4 benchmark functions, the first 2 test a global slice and a global array, the other 2 test a local slice and a local array:

package bench // any package name works; the file name must end in _test.go

import "testing"

var gs = make([]byte, 1000) // Global slice
var ga [1000]byte           // Global array

func BenchmarkSliceGlobal(b *testing.B) {
    for i := 0; i < b.N; i++ {
        for j, v := range gs {
            gs[j]++; gs[j] = gs[j] + v + 10; gs[j] += v
        }
    }
}

func BenchmarkArrayGlobal(b *testing.B) {
    for i := 0; i < b.N; i++ {
        for j, v := range ga {
            ga[j]++; ga[j] = ga[j] + v + 10; ga[j] += v
        }
    }
}

func BenchmarkSliceLocal(b *testing.B) {
    var s = make([]byte, 1000)
    for i := 0; i < b.N; i++ {
        for j, v := range s {
            s[j]++; s[j] = s[j] + v + 10; s[j] += v
        }
    }
}

func BenchmarkArrayLocal(b *testing.B) {
    var a [1000]byte
    for i := 0; i < b.N; i++ {
        for j, v := range a {
            a[j]++; a[j] = a[j] + v + 10; a[j] += v
        }
    }
}

I ran the test multiple times; here is the typical output (go test -bench .*):

BenchmarkSliceGlobal      300000              4210 ns/op
BenchmarkArrayGlobal      300000              4123 ns/op
BenchmarkSliceLocal       500000              3090 ns/op
BenchmarkArrayLocal       500000              3768 ns/op

Analyzing the results:

Accessing the global slice is slightly slower than accessing the global array, which is what I expected:
4210 vs 4123 ns/op

But accessing the local slice is significantly faster than accessing the local array:
3090 vs 3768 ns/op

My question is: What is the reason for this?

Notes

I tried varying the following things but none changed the outcome:

the size of the array/slice (tried 100, 1000, 10000)
the order of the benchmark functions
the element type of the array/slice (tried byte and int)

asked May 29 '15 08:05

icza

1 Answer

Comparing the amd64 assembly of BenchmarkArrayLocal and BenchmarkSliceLocal (too long to fit in this post) shows the following:

The array version loads the address of a from memory multiple times, practically on every array-access operation:

LEAQ    "".a+1000(SP),BX

The slice version, by contrast, computes exclusively on registers after loading once from memory:

LEAQ    (DX)(SI*1),BX

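This is not conclusive, but it is probably the cause, because both versions are otherwise virtually identical. One other notable detail is that the array version calls into runtime.duffcopy, a fairly long assembly routine, whereas the slice version does not. A likely source of that call is the loop header itself: a range over an array value with two iteration variables iterates over a copy of the array, while a range over a slice copies only the slice header.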
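The copy semantics of range are easy to observe directly. The following is a minimal sketch, not taken from the benchmark above; the three function names are hypothetical. Each one overwrites the last element during the first iteration and returns the value that the loop variable v holds in the final iteration:

```go
package main

import "fmt"

// lastSeenArray ranges over an array value. The range expression is copied
// before the loop starts, so v never observes the write to a[2].
func lastSeenArray() int {
	a := [3]int{1, 2, 3}
	last := 0
	for i, v := range a {
		if i == 0 {
			a[2] = 10
		}
		last = v
	}
	return last
}

// lastSeenArrayPtr ranges over a pointer to the array, so no copy of the
// elements is made and v observes the write.
func lastSeenArrayPtr() int {
	a := [3]int{1, 2, 3}
	last := 0
	for i, v := range &a {
		if i == 0 {
			a[2] = 10
		}
		last = v
	}
	return last
}

// lastSeenSlice ranges over a slice. Only the slice header is copied, so v
// reads the live backing array and observes the write.
func lastSeenSlice() int {
	s := []int{1, 2, 3}
	last := 0
	for i, v := range s {
		if i == 0 {
			s[2] = 10
		}
		last = v
	}
	return last
}

func main() {
	fmt.Println(lastSeenArray(), lastSeenArrayPtr(), lastSeenSlice()) // prints: 3 10 10
}
```

If the array copy is what slows down BenchmarkArrayLocal, ranging over &a or a[:] instead of a would be one way to test that hypothesis. Whether it changes the timings depends on the compiler version.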

answered Sep 21 '22 13:09

thwd

Tags: performance, arrays, slice, benchmarking, go