Impute missing values with mean of nearest neighbors in column

Question

I have a DataFrame:

import pandas as pd

df = pd.DataFrame(data=[676, 0, 670, 0, 668],
                  index=['2012-01-31 00:00:00', '2012-02-29 00:00:00',
                         '2012-03-31 00:00:00', '2012-04-30 00:00:00',
                         '2012-05-31 00:00:00'])
df.index.name = "Date"
df.columns = ["Number"]

Which looks like:

                     Number
Date
2012-01-31 00:00:00     676
2012-02-29 00:00:00       0
2012-03-31 00:00:00     670
2012-04-30 00:00:00       0
2012-05-31 00:00:00     668

How can I impute the 2nd and 4th values with (676+670)/2 and (670+668)/2, respectively?

I could save the values as an np.array and impute them in the array, but that's ridiculous!

spies006 · Accepted Answer

I first replace every 0 with np.nan, so the zeros are treated as missing values. Then I use the where method: non-missing values are kept, and each NaN is replaced by the average of the forward-filled (previous) and backward-filled (following) values. The ffill and bfill methods are used here instead of fillna(method=...), which recent pandas versions deprecate.

import numpy as np

nan_df = df.replace(to_replace=0, value=np.nan)
nan_df.where(nan_df.notna(), other=(nan_df.ffill() + nan_df.bfill()) / 2)

                     Number
Date                       
2012-01-31 00:00:00   676.0
2012-02-29 00:00:00   673.0
2012-03-31 00:00:00   670.0
2012-04-30 00:00:00   669.0
2012-05-31 00:00:00   668.0
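
A related shortcut, not part of the answer above: once the zeros are NaN, linear interpolation gives the same result for isolated gaps, because the midpoint between two equally spaced rows is the mean of the two neighbors. This is a minimal sketch, assuming 0 always marks a missing value and that zeros never occur in consecutive runs or in the first row (the name `filled` is only illustrative):

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the question so the example is self-contained.
df = pd.DataFrame(
    {"Number": [676, 0, 670, 0, 668]},
    index=pd.Index(
        ["2012-01-31 00:00:00", "2012-02-29 00:00:00", "2012-03-31 00:00:00",
         "2012-04-30 00:00:00", "2012-05-31 00:00:00"],
        name="Date",
    ),
)

# Treat 0 as missing, then interpolate linearly by row position.
# method="linear" ignores the index and treats rows as equally spaced.
filled = df.replace(0, np.nan).interpolate(method="linear")
print(filled)
```

For runs of several consecutive zeros the two approaches differ: interpolation produces evenly stepped values across the run, while the ffill/bfill average assigns the same mean to every row in the run.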

Allen · Answer

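# Use apply to replace each 0 in Number with the mean of the rows directly above and below.
# Note: a 0 in the first row is left as is, and a 0 in the last row raises an IndexError.
flat = df.reset_index()
df['Number'] = flat.apply(
    lambda row: flat['Number'].iloc[[row.name - 1, row.name + 1]].mean()
    if row.name > 0 and row.Number == 0 else row.Number,
    axis=1).values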

df
Out[1440]: 
                     Number
Date                       
2012-01-31 00:00:00   676.0
2012-02-29 00:00:00   673.0
2012-03-31 00:00:00   670.0
2012-04-30 00:00:00   669.0
2012-05-31 00:00:00   668.0
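
The same "row above, row below" idea can probably be expressed without a row-wise apply by using shift. The following is a sketch rather than the answer's own method; the names `s`, `neighbor_mean`, and `mask` are illustrative, and it assumes 0 marks a missing value:

```python
import pandas as pd

# Rebuild the DataFrame from the question so the example is self-contained.
df = pd.DataFrame(
    {"Number": [676, 0, 670, 0, 668]},
    index=pd.Index(
        ["2012-01-31 00:00:00", "2012-02-29 00:00:00", "2012-03-31 00:00:00",
         "2012-04-30 00:00:00", "2012-05-31 00:00:00"],
        name="Date",
    ),
)

s = df["Number"]

# Mean of the row directly above and the row directly below each position.
# The first and last rows have only one neighbor, so their mean is NaN.
neighbor_mean = (s.shift(1) + s.shift(-1)) / 2

# Replace only zeros that have both neighbors; keep every other value.
mask = (s == 0) & neighbor_mean.notna()
df["Number"] = s.where(~mask, neighbor_mean)
print(df)
```

Unlike the apply version, a 0 in the first or last row is simply left unchanged here. As with that version, the neighbors are the adjacent rows even if they are themselves 0, so this matches the ffill/bfill approach only when zeros are isolated.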

Tags: python, pandas, dataframe

Ladenkov Vladislav
