I need to generate a column that starts with an initial value and whose remaining values are computed by a function of earlier values in that same column. For example:
import pandas as pd

df = pd.DataFrame({'a': [1,1,5,2,7,8,16,16,16]})
df['b'] = 0
df.loc[0, 'b'] = 1  # .ix has been removed from pandas; .loc does the same here
df
    a  b
0   1  1
1   1  0
2   5  0
3   2  0
4   7  0
5   8  0
6  16  0
7  16  0
8  16  0
Now, I want to generate the rest of column 'b' so that each value is the minimum of the previous row (across both 'a' and 'b') plus two. One solution would be
for i in range(1, len(df)):
    df.loc[i, 'b'] = df.loc[i-1, :].min() + 2
Resulting in the desired output
    a   b
0   1   1
1   1   3
2   5   3
3   2   5
4   7   4
5   8   6
6  16   8
7  16  10
8  16  12
Does pandas have a 'clean' way to do this? Preferably one that would vectorize the computation?
pandas doesn't have a great way to handle general recursive calculations. There may be some trick to vectorize it, but if you can take the dependency, this is relatively painless and very fast with numba.
import numba
import numpy as np

@numba.njit
def make_b(a):
    b = np.zeros_like(a)
    b[0] = 1
    for i in range(1, len(a)):
        # each value depends on the previous row's a and b
        b[i] = min(b[i-1], a[i-1]) + 2
    return b

df['b'] = make_b(df['a'].values)
df
Out[73]:
    a   b
0   1   1
1   1   3
2   5   3
3   2   5
4   7   4
5   8   6
6  16   8
7  16  10
8  16  12
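If you can't take the numba dependency, the same recurrence can be written as a plain Python loop over a list. This is only a minimal sketch: `make_b_py` and its `first` and `step` parameters are hypothetical names used for illustration, and it will be slower than the compiled version on large inputs, though it still avoids the per-row DataFrame indexing of the original loop.

```python
def make_b_py(a, first=1, step=2):
    """Plain-Python sketch of b[i] = min(b[i-1], a[i-1]) + step, with b[0] = first."""
    if len(a) == 0:
        return []
    b = [first]
    # Walk every row except the last: row i-1 determines b[i].
    for prev_a in a[:-1]:
        b.append(min(b[-1], prev_a) + step)
    return b


a = [1, 1, 5, 2, 7, 8, 16, 16, 16]
print(make_b_py(a))
```

With a DataFrame, this could be used as df['b'] = make_b_py(df['a'].tolist()).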