Shift elements in a numpy array

Tags:

python

numpy

This question contains its own answer at the bottom. Use preallocated arrays.

Following-up from this question years ago, is there a canonical "shift" function in numpy? I don't see anything from the documentation.

Here's a simple version of what I'm looking for:

def shift(xs, n):
    if n >= 0:
        return np.r_[np.full(n, np.nan), xs[:-n]]
    else:
        return np.r_[xs[-n:], np.full(-n, np.nan)]

Using this is like:

In [76]: xs
Out[76]: array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])

In [77]: shift(xs, 3)
Out[77]: array([ nan,  nan,  nan,   0.,   1.,   2.,   3.,   4.,   5.,   6.])

In [78]: shift(xs, -3)
Out[78]: array([  3.,   4.,   5.,   6.,   7.,   8.,   9.,  nan,  nan,  nan])

_{This question came from my attempt to write a fast rolling_product yesterday. I needed a way to "shift" a cumulative product and all I could think of was to replicate the logic in np.roll().}

So np.concatenate() is much faster than np.r_[]. This version of the function performs a lot better:

def shift(xs, n):
    if n >= 0:
        return np.concatenate((np.full(n, np.nan), xs[:-n]))
    else:
        return np.concatenate((xs[-n:], np.full(-n, np.nan)))

An even faster version simply pre-allocates the array:

def shift(xs, n):
    e = np.empty_like(xs)
    if n >= 0:
        e[:n] = np.nan
        e[n:] = xs[:-n]
    else:
        e[n:] = np.nan
        e[:n] = xs[-n:]
    return e

The above proposal is the answer. Use preallocated arrays.

794

asked May 22 '15 14:05

chrisaycock

2 Answers

Not numpy but scipy provides exactly the shift functionality you want,

import numpy as np
from scipy.ndimage.interpolation import shift

xs = np.array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])

shift(xs, 3, cval=np.NaN)

where default is to bring in a constant value from outside the array with value cval, set here to nan. This gives the desired output,

array([ nan, nan, nan, 0., 1., 2., 3., 4., 5., 6.])

and the negative shift works similarly,

shift(xs, -3, cval=np.NaN)

Provides output

array([  3.,   4.,   5.,   6.,   7.,   8.,   9.,  nan,  nan,  nan])

149

answered Oct 17 '22 07:10

Ed Smith

For those who want to just copy and paste the fastest implementation of shift, there is a benchmark and conclusion(see the end). In addition, I introduce fill_value parameter and fix some bugs.

Benchmark

import numpy as np
import timeit

# enhanced from IronManMark20 version
def shift1(arr, num, fill_value=np.nan):
    arr = np.roll(arr,num)
    if num < 0:
        arr[num:] = fill_value
    elif num > 0:
        arr[:num] = fill_value
    return arr

# use np.roll and np.put by IronManMark20
def shift2(arr,num):
    arr=np.roll(arr,num)
    if num<0:
         np.put(arr,range(len(arr)+num,len(arr)),np.nan)
    elif num > 0:
         np.put(arr,range(num),np.nan)
    return arr

# use np.pad and slice by me.
def shift3(arr, num, fill_value=np.nan):
    l = len(arr)
    if num < 0:
        arr = np.pad(arr, (0, abs(num)), mode='constant', constant_values=(fill_value,))[:-num]
    elif num > 0:
        arr = np.pad(arr, (num, 0), mode='constant', constant_values=(fill_value,))[:-num]

    return arr

# use np.concatenate and np.full by chrisaycock
def shift4(arr, num, fill_value=np.nan):
    if num >= 0:
        return np.concatenate((np.full(num, fill_value), arr[:-num]))
    else:
        return np.concatenate((arr[-num:], np.full(-num, fill_value)))

# preallocate empty array and assign slice by chrisaycock
def shift5(arr, num, fill_value=np.nan):
    result = np.empty_like(arr)
    if num > 0:
        result[:num] = fill_value
        result[num:] = arr[:-num]
    elif num < 0:
        result[num:] = fill_value
        result[:num] = arr[-num:]
    else:
        result[:] = arr
    return result

arr = np.arange(2000).astype(float)

def benchmark_shift1():
    shift1(arr, 3)

def benchmark_shift2():
    shift2(arr, 3)

def benchmark_shift3():
    shift3(arr, 3)

def benchmark_shift4():
    shift4(arr, 3)

def benchmark_shift5():
    shift5(arr, 3)

benchmark_set = ['benchmark_shift1', 'benchmark_shift2', 'benchmark_shift3', 'benchmark_shift4', 'benchmark_shift5']

for x in benchmark_set:
    number = 10000
    t = timeit.timeit('%s()' % x, 'from __main__ import %s' % x, number=number)
    print '%s time: %f' % (x, t)

benchmark result:

benchmark_shift1 time: 0.265238
benchmark_shift2 time: 0.285175
benchmark_shift3 time: 0.473890
benchmark_shift4 time: 0.099049
benchmark_shift5 time: 0.052836

Conclusion

shift5 is winner! It's OP's third solution.

110

answered Oct 17 '22 07:10

gzc

Related questions
                            
                                Replace invalid values with None in Pandas DataFrame
                            
                                memory-efficient built-in SqlAlchemy iterator/generator?
                            
                                Access index of last element in data frame
                            
                                Generic catch for python
                            
                                Convert Python ElementTree to string
                            
                                How to run a script forever? [duplicate]
                            
                                Find the greatest number in a list of numbers
                            
                                How to deal with certificates using Selenium?
                            
                                imread returns None, violating assertion !_src.empty() in function 'cvtColor' error
                            
                                search by ObjectId in mongodb with pymongo
                            
                                deleting rows in numpy array
                            
                                Logistic regression python solvers' definitions
                            
                                Logging to two files with different settings
                            
                                How to install Python packages from the tar.gz file without using pip install
                            
                                H14 error in heroku - "no web processes running"
                            
                                Handle JSON Decode Error when nothing returned
                            
                                How to check type of files without extensions? [duplicate]
                            
                                Test if all elements of a python list are False
                            
                                `ipython` tab autocomplete does not work on imported module
                            
                                save a pandas.Series histogram plot to file

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With