So far I have the following code, provided by EdChum:
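In [1]:
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [None] * 6, 'b': [2, 3, 10, 3, 5, 8]})
df["c"] = np.nan
# Use .loc instead of chained indexing so the assignment reliably hits df.
df.loc[0, "c"] = 1
df.loc[2, "c"] = 3

def func(x):
    # Keep an existing value; otherwise use the previous row's c times this row's b.
    if pd.notnull(x['c']):
        return x['c']
    else:
        return df.iloc[x.name - 1]['c'] * x['b']

# A pass can only fill a NaN whose previous row is already filled,
# so the run of three consecutive NaNs needs three passes.
df['c'] = df.apply(func, axis=1)
df['c'] = df.apply(func, axis=1)
df['c'] = df.apply(func, axis=1)
df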
Out[1]:
      a   b    c
0  None   2    1
1  None   3    3
2  None  10    3
3  None   3    9
4  None   5   45
5  None   8  360
That works perfectly well, but as soon as I change the index of the DataFrame df as follows:
rng = pd.date_range('1/1/2011', periods=6, freq='D')
df = pd.DataFrame({'a': [None] * 6, 'b': [2, 3, 10, 3, 5, 8]},index=rng)
I get the following error: TypeError: ("cannot do label indexing on <class 'pandas.tseries.index.DatetimeIndex'> with these indexers [2011-01-01 00:00:00] of <class 'pandas.tslib.Timestamp'>", u'occurred at index 2011-01-02 00:00:00')
What is the problem here? How do I need to adjust the code to make it work with a DatetimeIndex?
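A quick way to see where the error comes from is to inspect what apply passes in as x.name. The sketch below is only an illustration (label_types and pos are made-up names, and the frame is cut down to column b): with axis=1, x.name is the row's index label, which for a DatetimeIndex is a Timestamp rather than an integer row position, and get_loc is one way to map that label back to a position.

```python
import pandas as pd

rng = pd.date_range('1/1/2011', periods=6, freq='D')
df = pd.DataFrame({'b': [2, 3, 10, 3, 5, 8]}, index=rng)

# Record the type of x.name that apply sees for each row.
label_types = df.apply(lambda x: type(x.name).__name__, axis=1)

# get_loc maps an index label back to its integer position.
pos = df.index.get_loc(rng[2])

print(label_types.unique())
print(pos)
```

With the default integer index the label and the position happen to coincide, which is why x.name - 1 worked in the first version.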
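The following works. With a DatetimeIndex, x.name is a Timestamp label rather than an integer position, which iloc cannot use. The difference here is that I get the integer location of the datetime value in the index using get_loc: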
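In [48]:
import numpy as np
import pandas as pd

rng = pd.date_range('1/1/2011', periods=6, freq='D')
df = pd.DataFrame({'a': [None] * 6, 'b': [2, 3, 10, 3, 5, 8]}, index=rng)
df["c"] = np.nan
# Use .loc with the index labels instead of chained indexing.
df.loc[rng[0], "c"] = 1
df.loc[rng[2], "c"] = 3

def func(x):
    if pd.notnull(x['c']):
        return x['c']
    else:
        # Convert the row label (a Timestamp) to an integer position first.
        return df.iloc[df.index.get_loc(x.name) - 1]['c'] * x['b']

# A pass can only fill a NaN whose previous row is already filled.
df['c'] = df.apply(func, axis=1)
df['c'] = df.apply(func, axis=1)
df['c'] = df.apply(func, axis=1)
df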
Out[48]:
               a   b    c
2011-01-01  None   2    1
2011-01-02  None   3    3
2011-01-03  None  10    3
2011-01-04  None   3    9
2011-01-05  None   5   45
2011-01-06  None   8  360
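As a possible alternative (not part of the code above), the repeated apply calls could be replaced by a single positional loop over plain arrays, which never touches the index labels and so behaves the same for an integer index or a DatetimeIndex. This is a minimal sketch that assumes the first row of c is already filled; b and c here are just local array names.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('1/1/2011', periods=6, freq='D')
df = pd.DataFrame({'a': [None] * 6, 'b': [2, 3, 10, 3, 5, 8]}, index=rng)
df['c'] = np.nan
df.loc[rng[0], 'c'] = 1
df.loc[rng[2], 'c'] = 3

b = df['b'].to_numpy()
c = df['c'].to_numpy(copy=True)  # work on a copy, write back at the end

# Single forward pass: each NaN becomes previous c times current b.
for i in range(1, len(c)):
    if np.isnan(c[i]):
        c[i] = c[i - 1] * b[i]

df['c'] = c
print(df)
```

Because each filled value is available to the next iteration, one pass is enough regardless of how many consecutive NaNs there are.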