Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there any penalty to recursively slicing of a np array many times?

Tags:

python

numpy

Say I want to use a np array as a fixed read-only queue and pop off the front of it. This is a natural way to do it:

def pop(k,q):
    return q[:k],q[k:]
## example usage:
x = np.arange(1000)
for i in range(5):
    a,x = pop(i,x)
    print(a) 

This seems to be fine, but I want to be sure that there's no hidden state or hidden references that build up if q=q[k:] gets executed thousands or millions of times. There doesn't seem to be: to take a slice, np simply stores a pointer to the original data buffer, and the new index. But I want to be sure, because if there is something I'm missing, I could represent this as a tuple (np.array,index), which is not as clean:

def pop0(k,q):
    x,i = q
    return x[i:i+k],(x,i+k)
like image 414
mrip Avatar asked Nov 07 '22 06:11

mrip


1 Answers

It will be safe. You can use memory-profiler package to observe memory usage:

For following code:

import numpy as np

@profile
def good_pop():
    def pop(k,q):
        return q[:k],q[k:]
    x = np.arange(10000)
    for i in range(5000):
        a,x = pop(i,x)

@profile
def bad_pop():
    def pop(k,q):
        return q[:k].copy(),q[k:].copy()
    x = np.arange(10000)
    for i in range(5000):
        a,x = pop(i,x)

good_pop()
bad_pop()

profiling with command:

$ python3 -m memory_profiler file.py

produced result:

Line #    Mem usage    Increment   Line Contents
================================================
     3   49.023 MiB   49.023 MiB   @profile
     4                             def good_pop():
     5   49.023 MiB    0.000 MiB       def pop(k,q):
     6   49.023 MiB    0.000 MiB           return q[:k],q[k:]
     7   49.023 MiB    0.000 MiB       x = np.arange(10000)
     8   49.023 MiB    0.000 MiB       for i in range(5000):
     9   49.023 MiB    0.000 MiB           a,x = pop(i,x)


Line #    Mem usage    Increment   Line Contents
================================================
    11   49.023 MiB   49.023 MiB   @profile
    12                             def bad_pop():
    13   49.234 MiB    0.000 MiB       def pop(k,q):
    14   49.234 MiB    0.211 MiB           return q[:k].copy(),q[k:].copy()
    15   49.023 MiB    0.000 MiB       x = np.arange(10000)
    16   49.234 MiB    0.000 MiB       for i in range(5000):
    17   49.234 MiB    0.000 MiB           a,x = pop(i,x)

A bunch of zeros for good_pop() is a proof that no extra objects are created in this loop.

like image 86
tstanisl Avatar answered Nov 14 '22 21:11

tstanisl