Pandas: replace numpy.nan cell with maximum of non-nan adjacent cells

Question

Test case:

import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 2, np.nan, 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, 5],
                   [np.nan, 3, np.nan, 4]],
                  columns=list('ABCD'))

I want to replace each NaN entry A[i, j] with the maximum of its non-NaN adjacent entries, where the entries adjacent to A[i, j] are A[i + 1, j], A[i - 1, j], A[i, j + 1], and A[i, j - 1].

In other words, this:

     A    B   C  D
0  NaN  2.0 NaN  0
1  3.0  4.0 NaN  1
2  NaN  NaN NaN  5
3  NaN  3.0 NaN  4

should become this:

     A    B   C  D
0  3.0  2.0 2.0  0.0
1  3.0  4.0 4.0  1.0
2  3.0  4.0 5.0  5.0
3  3.0  3.0 4.0  4.0

Ted Petrou · Accepted Answer

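You can apply a centered rolling window of size 3 with min_periods=1 along each axis and take the maximum in each direction. The element-wise larger of the two results is the maximum over each cell's non-NaN vertical and horizontal neighbors. The window also includes the cell itself, but that is harmless because only NaN cells get filled. You can then use the result to fill in the missing values of the original.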

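# Max over each cell and its neighbors above and below (window of 3 down each column)
df1 = df.rolling(3, center=True, min_periods=1).max().fillna(-np.inf)
# Same along each row, done by transposing, rolling, and transposing back
df2 = df.T.rolling(3, center=True, min_periods=1).max().T.fillna(-np.inf)
# Element-wise larger of the vertical and horizontal maxima
fill = df1.where(df1 > df2).fillna(df2)
# Only the NaN cells of df are replaced; existing values are kept
df.fillna(fill)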

Output

     A    B    C  D
0  3.0  2.0  2.0  0
1  3.0  4.0  4.0  1
2  3.0  4.0  5.0  5
3  3.0  3.0  4.0  4
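
One edge case worth noting: with the code above, a NaN cell whose four neighbors are all NaN as well is filled with -inf, because of the fillna(-np.inf) step. The sketch below is a possible variant, not part of the accepted answer, that leaves such cells as NaN. It swaps rolling for NumPy slices of a NaN-padded array combined with np.fmax, which ignores NaN when the other operand is a number. The function name fill_with_neighbor_max is made up for illustration, and the sketch assumes all columns are numeric.

```python
import numpy as np
import pandas as pd


def fill_with_neighbor_max(df: pd.DataFrame) -> pd.DataFrame:
    """Fill each NaN with the max of its non-NaN up/down/left/right neighbors.

    Hypothetical helper for illustration; assumes every column is numeric.
    """
    a = df.to_numpy(dtype=float)
    # Pad with a NaN border so that edge cells also have four "neighbors".
    p = np.pad(a, 1, constant_values=np.nan)
    up = p[:-2, 1:-1]
    down = p[2:, 1:-1]
    left = p[1:-1, :-2]
    right = p[1:-1, 2:]
    # np.fmax returns NaN only when both operands are NaN.
    best = np.fmax(np.fmax(up, down), np.fmax(left, right))
    # Keep existing values; replace NaN cells with the neighbor maximum.
    filled = np.where(np.isnan(a), best, a)
    return pd.DataFrame(filled, index=df.index, columns=df.columns)


df = pd.DataFrame([[np.nan, 2, np.nan, 0],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, 5],
                   [np.nan, 3, np.nan, 4]],
                  columns=list('ABCD'))
print(fill_with_neighbor_max(df))
```

Like the rolling version, this reads neighbors from the original frame in a single pass, so a filled value never feeds into another fill. Unlike it, every column comes back as float.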

Tags: python, pandas

user189035
