
Can NumPy take care that an array is (nonstrictly) increasing along one axis?

Is there a function in numpy to guarantee or rather fix an array such that it is (nonstrictly) increasing along one particular axis? For example, I have the following 2D array:

X = np.array([[1, 2, 1, 4, 5],
              [0, 3, 1, 5, 4]])

np.foobar(X) should return

array([[1, 2, 2, 4, 5],
       [0, 3, 3, 5, 5]])

Does foobar exist or do I need to do that manually by using something like np.diff and some smart indexing?

SmCaterpillar asked Dec 13 '22 19:12


2 Answers

Use np.maximum.accumulate to take a running (accumulated) maximum along that axis, which guarantees the result is non-strictly increasing (non-decreasing) -

np.maximum.accumulate(X,axis=1)

Sample run -

In [233]: X
Out[233]: 
array([[1, 2, 1, 4, 5],
       [0, 3, 1, 5, 4]])

In [234]: np.maximum.accumulate(X,axis=1)
Out[234]: 
array([[1, 2, 2, 4, 5],
       [0, 3, 3, 5, 5]])
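
To convince yourself that the result really is non-decreasing along the chosen axis, a quick check with np.diff (which the question mentions) can help. This is only a minimal sketch; is_nondecreasing is a hypothetical helper name, not a NumPy function -

```python
import numpy as np

X = np.array([[1, 2, 1, 4, 5],
              [0, 3, 1, 5, 4]])

def is_nondecreasing(a, axis=-1):
    """Hypothetical helper: True if `a` never decreases along `axis`."""
    # Consecutive differences along the axis must all be >= 0.
    return bool(np.all(np.diff(a, axis=axis) >= 0))

fixed = np.maximum.accumulate(X, axis=1)

before = is_nondecreasing(X, axis=1)      # original data has drops
after = is_nondecreasing(fixed, axis=1)   # running max removes them
print(before, after)
```

The same call works along the other axis by passing axis=0 to both np.maximum.accumulate and the check.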

For memory efficiency, we can write the result back into the input itself (an in-place change) by passing the input array as the out argument of accumulate.
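
A minimal sketch of that in-place variant, assuming X is a writable array whose dtype can hold the result (true here, since input and output are both integer arrays) -

```python
import numpy as np

X = np.array([[1, 2, 1, 4, 5],
              [0, 3, 1, 5, 4]])

# Pass X as `out` so the running maximum overwrites X
# rather than being returned as a separate new array.
np.maximum.accumulate(X, axis=1, out=X)

print(X)
```

After this call the original values of X are gone, so keep a copy first if they are still needed.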

Runtime tests

Case #1 : Array as input

In [254]: X = np.random.rand(1000,1000)

In [255]: %timeit np.maximum.accumulate(X,axis=1)
1000 loops, best of 3: 1.69 ms per loop

# @cᴏʟᴅsᴘᴇᴇᴅ's pandas soln using df.cummax
In [256]: %timeit pd.DataFrame(X).cummax(axis=1).values
100 loops, best of 3: 4.81 ms per loop

Case #2 : Dataframe as input

In [257]: df = pd.DataFrame(np.random.rand(1000,1000))

In [258]: %timeit np.maximum.accumulate(df.values,axis=1)
1000 loops, best of 3: 1.68 ms per loop

# @cᴏʟᴅsᴘᴇᴇᴅ's pandas soln using df.cummax
In [259]: %timeit df.cummax(axis=1)
100 loops, best of 3: 4.68 ms per loop

Divakar answered May 13 '23 12:05


pandas offers you the df.cummax function:

import pandas as pd
pd.DataFrame(X).cummax(axis=1).values

array([[1, 2, 2, 4, 5],
       [0, 3, 3, 5, 5]])

It's useful to know that there's a first-class function on hand in case your data is already loaded into a DataFrame.

cs95 answered May 13 '23 13:05