I have a DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame(data=[676, 0, 670, 0, 668],
                  index=['2012-01-31 00:00:00', '2012-02-29 00:00:00',
                         '2012-03-31 00:00:00', '2012-04-30 00:00:00',
                         '2012-05-31 00:00:00'])
df.index.name = "Date"
df.columns = ["Number"]
Which looks like:
                     Number
Date
2012-01-31 00:00:00     676
2012-02-29 00:00:00       0
2012-03-31 00:00:00     670
2012-04-30 00:00:00       0
2012-05-31 00:00:00     668
How can I replace the 2nd and 4th values with (676+670)/2 and (670+668)/2, respectively?
I could save the values as an np.array and impute them there, but that's ridiculous!
I use the where method and replace any 0 with np.nan. Once every 0 is treated as NaN, we can use the fillna method. Using ffill and bfill, we fill each NaN with the corresponding previous and following values, add them, and divide by 2.
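masked = df.replace(to_replace=0, value=np.nan)  # treat 0 as missing
# keep non-zero values; elsewhere use the average of the previous and next values
# (ffill()/bfill() are shorthand for fillna(method='ffill') / fillna(method='bfill'))
df.where(df != 0, other=(masked.ffill() + masked.bfill()) / 2)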
                     Number
Date
2012-01-31 00:00:00   676.0
2012-02-29 00:00:00   673.0
2012-03-31 00:00:00   670.0
2012-04-30 00:00:00   669.0
2012-05-31 00:00:00   668.0
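To see why this works, it can help to look at the intermediate frames one at a time. The following is only a sketch of the steps described above, rebuilt on the question's data; the names masked, prev_vals, next_vals and result are mine and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame(
    {"Number": [676, 0, 670, 0, 668]},
    index=pd.Index(
        ["2012-01-31 00:00:00", "2012-02-29 00:00:00", "2012-03-31 00:00:00",
         "2012-04-30 00:00:00", "2012-05-31 00:00:00"],
        name="Date",
    ),
)

masked = df.replace(0, np.nan)  # zeros become NaN
prev_vals = masked.ffill()      # each NaN takes the previous valid value
next_vals = masked.bfill()      # each NaN takes the next valid value

# Keep the non-zero values; elsewhere use the mean of the two filled frames.
result = df.where(df != 0, other=(prev_vals + next_vals) / 2)

print(prev_vals["Number"].tolist())
print(next_vals["Number"].tolist())
print(result["Number"].tolist())
```

Note that a zero in the first or last row has no valid value on one side, so one of the two fills stays NaN and the result for that row is NaN as well.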
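# Use apply to fill each zero Number with the average of the surrounding rows.
# Note: a zero in the first row is left as 0, and a zero in the last row raises IndexError.
tmp = df.reset_index()  # positional index, so x.name - 1 and x.name + 1 are the neighbours
df['Number'] = tmp.apply(
    lambda x: tmp.iloc[[x.name - 1, x.name + 1]]['Number'].mean()
    if (x.name > 0) and (x.Number == 0) else x.Number,
    axis=1).values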
df
Out[1440]:
                     Number
Date
2012-01-31 00:00:00   676.0
2012-02-29 00:00:00   673.0
2012-03-31 00:00:00   670.0
2012-04-30 00:00:00   669.0
2012-05-31 00:00:00   668.0
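The same "average of the rows directly above and below" idea can probably be written without a row-wise apply by using shift. This is a sketch of an alternative, not the answer's own code; the names s, neighbour_mean and filled are hypothetical.

```python
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame(
    {"Number": [676, 0, 670, 0, 668]},
    index=pd.Index(
        ["2012-01-31 00:00:00", "2012-02-29 00:00:00", "2012-03-31 00:00:00",
         "2012-04-30 00:00:00", "2012-05-31 00:00:00"],
        name="Date",
    ),
)

s = df["Number"]
# shift(1) aligns each row with the row above it, shift(-1) with the row below.
neighbour_mean = (s.shift(1) + s.shift(-1)) / 2
# Keep non-zero values; replace zeros with the mean of their two neighbours.
filled = s.where(s != 0, neighbour_mean)
df["Number"] = filled

print(df["Number"].tolist())
```

As with the apply version, this averages the immediate neighbours even if one of them is itself 0. It differs at the edges: a zero in the first or last row becomes NaN here, because one of the shifted values is missing.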