 

Is numpy reusing memory from unused arrays?

While implementing a Gauss-Seidel solver with Python and NumPy, I discovered an interesting side effect. I tried to extract a minimal example:


    number = 1000
    startup = 'import numpy as np;N = 2048;X = np.arange(N * N).reshape((N, N));'
    startup2 = startup + 'np.empty_like(X)'


    example = ('X[1:: 2, : -1: 2] = ('
               'X[: -1: 2, 1:: 2] +'
               'X[1:: 2, : -1: 2] +'
               'X[1:: 2, : -1: 2] +'
               'X[: -1: 2, : -1: 2])/4')

Running print(timeit.timeit(example, setup=startup, number=number)) takes ~5 s on my machine,

while print(timeit.timeit(example, setup=startup2, number=number)) takes ~4 s.

So the second version is about 1 s faster, although it contains this unnecessary array allocation with np.empty_like(X). I observed this effect on various machines and with various array sizes and iteration counts.
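
For reference, here is the complete benchmark as one runnable script, assembled from the snippets above (the timings in the comments are just what I see on my machine):

    import timeit

    number = 1000
    startup = 'import numpy as np;N = 2048;X = np.arange(N * N).reshape((N, N));'
    startup2 = startup + 'np.empty_like(X)'

    example = ('X[1:: 2, : -1: 2] = ('
               'X[: -1: 2, 1:: 2] +'
               'X[1:: 2, : -1: 2] +'
               'X[1:: 2, : -1: 2] +'
               'X[: -1: 2, : -1: 2])/4')

    print(timeit.timeit(example, setup=startup, number=number))   # ~5 s on my machine
    print(timeit.timeit(example, setup=startup2, number=number))  # ~4 s on my machine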

I assume that evaluating the right-hand side of the assignment leads to a temporary array allocation. It seems that NumPy somehow reuses the unused array that was created with np.empty_like(X) to speed up that temporary allocation.

Is this assumption correct, or is something entirely different the reason for the timing difference?

If I remove the /4:

    example = ('X[1:: 2, : -1: 2] = ('
               'X[: -1: 2, 1:: 2] +'
               'X[1:: 2, : -1: 2] +'
               'X[1:: 2, : -1: 2] +'
               'X[: -1: 2, : -1: 2])')

then I cannot observe any difference in execution time between the two versions. So I assume that in this case the calculation can be done in place, and no temporary allocation occurs.

Is there a more explicit way to exploit this effect? Just writing np.empty_like(X) looks somewhat "hacky" to me.
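
One more explicit alternative I can think of is to preallocate a scratch buffer once and route the intermediate results through it via the ufunc out= argument, so no temporaries are created at all. A sketch, assuming the dtype is switched to float so the division can also be done in place:

    import numpy as np

    N = 2048
    X = np.arange(N * N, dtype=float).reshape((N, N))  # float so /= 4 works in place

    # one reusable work array with the shape of the updated block
    buf = np.empty_like(X[1:: 2, : -1: 2])

    np.add(X[: -1: 2, 1:: 2], X[1:: 2, : -1: 2], out=buf)
    np.add(buf, X[1:: 2, : -1: 2], out=buf)
    np.add(buf, X[: -1: 2, : -1: 2], out=buf)
    buf /= 4
    X[1:: 2, : -1: 2] = buf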

Thanks in advance!

Updates:

Thank you for your answers and comments so far.

Finally I found some more time to experiment with my observations, and I am not sure whether my explanations above are clear enough. My initial observation was that this

    import numpy as np

    number = 1000
    N = 1024
    X = np.arange(N * N).reshape((N, N))
    np.empty_like(X)
    for _ in range(number):
        X[1:: 2, : -1: 2] = (X[: -1: 2, 1:: 2] + X[1:: 2, : -1: 2] +
                             X[1:: 2, : -1: 2] + X[: -1: 2, : -1: 2]) / 4

was faster than this:

    import numpy as np

    number = 1000
    N = 1024
    X = np.arange(N * N).reshape((N, N))
    for _ in range(number):
        X[1:: 2, : -1: 2] = (X[: -1: 2, 1:: 2] + X[1:: 2, : -1: 2] +
                             X[1:: 2, : -1: 2] + X[: -1: 2, : -1: 2]) / 4

This I found quite surprising, because the totally unused and unnecessary array allocation with np.empty_like(X) seems to accelerate the loop below it. It does not matter whether I use np.empty_like, np.zeros_like, np.ones_like, np.ones(X.shape) or X.copy(), as long as an unused array is allocated. It also occurs for different values of N, different iteration counts, and on different machines.
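
One way to probe the reuse hypothesis directly is to compare raw data pointers. A sketch; whether the address actually matches is allocator-dependent and not guaranteed:

    import numpy as np

    N = 1024
    X = np.arange(N * N).reshape((N, N))

    scratch = np.empty_like(X)
    addr = scratch.__array_interface__['data'][0]  # raw address of the scratch buffer
    del scratch                                    # hand the buffer back to the allocator

    tmp = X + X                                    # full-size temporary on the RHS
    print(tmp.__array_interface__['data'][0] == addr)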

Furthermore, I tried to investigate the issue with the tracemalloc module:

    import tracemalloc
    import numpy as np

    number = 1000
    N = 1024
    X = np.arange(N * N).reshape((N, N))
    tracemalloc.start()
    np.empty_like(X)
    for _ in range(number):
        X[1:: 2, : -1: 2] = (X[: -1: 2, 1:: 2] + X[1:: 2, : -1: 2] +
                             X[1:: 2, : -1: 2] + X[: -1: 2, : -1: 2]) / 4
    s2 = tracemalloc.take_snapshot()

    display_top(s2)

Here display_top is the helper function from the tracemalloc documentation, except that I print the sizes in bytes instead of KiB.
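
For completeness, this is essentially the "pretty top" example from the tracemalloc documentation, with the sizes printed in raw bytes:

    import linecache
    import tracemalloc

    def display_top(snapshot, key_type='lineno', limit=10):
        snapshot = snapshot.filter_traces((
            tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
            tracemalloc.Filter(False, "<unknown>"),
        ))
        top_stats = snapshot.statistics(key_type)

        print("Top %s lines" % limit)
        for index, stat in enumerate(top_stats[:limit], 1):
            frame = stat.traceback[0]
            print("#%s: %s:%s: %.1f B"
                  % (index, frame.filename, frame.lineno, stat.size))
            # show the source line responsible for the allocation
            line = linecache.getline(frame.filename, frame.lineno).strip()
            if line:
                print('    %s' % line)

        other = top_stats[limit:]
        if other:
            size = sum(stat.size for stat in other)
            print("%s other: %.1f B" % (len(other), size))
        total = sum(stat.size for stat in top_stats)
        print("Total allocated size: %.1f B" % total)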

When I run it without the extra array allocation from np.empty_like(X), I get output like this:

    Top 10 lines
    #1: ./minexample.py:40: 160.0 B
        X[1:: 2, : -1: 2] = (X[: -1: 2, 1:: 2] + X[1:: 2, : -1: 2] + X[1:: 2, : -1: 2] + X[: -1: 2, : -1: 2]) / 4
    #2: ./minexample.py:39: 28.0 B
        for _ in range(1000):
    Total allocated size: 188.0 B

With the extra allocation I get this:

    Top 10 lines
    #1: ./minexample.py:40: 128.0 B
        X[1:: 2, : -1: 2] = (X[: -1: 2, 1:: 2] + X[1:: 2, : -1: 2] + X[1:: 2, : -1: 2] + X[: -1: 2, : -1: 2]) / 4
    #2: ./minexample.py:38: 32.0 B
        np.empty_like(X)
    #3: ./minexample.py:39: 28.0 B
        for _ in range(1000):
    Total allocated size: 188.0 B

So the size allocated by the line in the loop is smaller than when no unused array is preallocated. It seems to me that this unused array is being reused.

Maybe this explains the effect?

asked Oct 22 '20 by christoph



1 Answer

empty_like(X) allocates an array with the same shape and dtype as X without initializing its entries (they contain whatever happened to be in that memory), which is why it is so fast. But keep in mind that adding empty_like(X) to something does not give you the desired result (you would need zeros_like(X) for that)!
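
A quick illustration of the difference:

    import numpy as np

    a = np.zeros((2, 3))
    b = np.empty_like(a)  # same shape and dtype as a, but contents are uninitialized
    c = np.zeros_like(a)  # same shape and dtype, explicitly filled with zeros

    print(b)  # arbitrary leftover values, not necessarily zeros
    print(c)  # guaranteed zeros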

Why adding two arrays is fast in NumPy has to do with densely packed arrays and is described here: https://stackoverflow.com/a/8385658/14344821. This can potentially be faster than creating the matrix X with explicitly specified entries. Tips on how to create NumPy arrays efficiently can be found here.

answered Oct 14 '22 by Casper Dijkstra