I'm trying to make a version of my program faster using as much Pandas and Numpy as possible. I am new to Numpy but have been grasping most of it, but I am having trouble with conditional formatting a column with the max of a range. This is the code I am trying to use to achieve this:
x=3
df1['Max']=numpy.where(df1.index>=x,max(df1.High[-x:],0))
Basically, I am trying to conditionally put the maximum value over the last 3 entries into a cell and repeat down the column. Any and all help is appreciated.
Use Scipy's maximum_filter
-
from scipy.ndimage.filters import maximum_filter1d
df['max'] = maximum_filter1d(df.High,size=3,origin=1,mode='nearest')
Basically, maximum_filter operates in a sliding window looking for maximum in that window. Now, by default each such max
computation would be performed with window being centered at the index itself. Since, we are looking to go three elements before and ending at the current one, we need to change that centeredness with the parameter origin
. Therefore, we have it set at 1
.
Sample run -
In [21]: df
Out[21]:
High max
0 13 13
1 77 77
2 16 77
3 30 77
4 25 30
5 98 98
6 79 98
7 58 98
8 51 79
9 23 58
Runtime test
Got me interested to see how this Scipy's sliding max operation performs against Pandas's rolling max method on performance. Here's some results on big datasizes -
In [55]: df = pd.DataFrame(np.random.randint(0,99,(10000)),columns=['High'])
In [56]: %%timeit # @Merlin's rolling based solution :
...: df['max'] = df.High.rolling(window=3, min_periods=1).max()
...:
1000 loops, best of 3: 1.35 ms per loop
In [57]: %%timeit # Using Scipy's max filter :
...: df['max1'] = maximum_filter1d(df.High,size=3,\
...: origin=1,mode='nearest')
...:
1000 loops, best of 3: 487 µs per loop
Here is the logic on np.where
numpy.where('test something,if true ,if false)
I think you need below.
dd= {'to': [100, 200, 300, 400, -500, 600, 700,800, 900, 1000]}
df = pd.DataFrame(dd)
df
to
0 100
1 200
2 300
3 400
4 -500
5 600
6 700
7 800
8 900
9 1000
df['Max'] = df.rolling(window=3, min_periods=1).max()
to Max
0 100 100.0
1 200 200.0
2 300 300.0
3 400 400.0
4 -500 400.0
5 600 600.0
6 700 700.0
7 800 800.0
8 900 900.0
9 1000 1000.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With