I would like to use fillna to fill the None values in a column with that column's most frequent value, ignoring None and NaN.
Input DF:
Col_A
a
None
None
c
c
d
d
The output DataFrame could be:
Col_A
a
c
c
c
c
d
d
Any suggestions would be much appreciated. Many thanks and best regards, Carlo
Prelude: If your None is actually the string 'None' rather than a real missing value, save yourself a headache by converting it to NaN first. Use replace:
import numpy as np
df = df.replace('None', np.nan)
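As a minimal, self-contained sketch of that prelude: the frame below is made up for illustration and assumes the column mixes a real None with the literal string 'None'. It only shows that both end up counted as missing after the replace.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: one real None and one literal string 'None'.
df = pd.DataFrame({"Col_A": ["a", None, "None", "c", "c", "d", "d"]})

# Convert the string 'None' into a real missing value. A genuine None is
# already treated as missing by isna() and fillna().
df = df.replace("None", np.nan)

n_missing = int(df["Col_A"].isna().sum())
print(n_missing)
```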
I believe you could use fillna + value_counts:
df
Col_A
0 a
1 NaN
2 NaN
3 c
4 c
5 d
6 d
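# value_counts() sorts by count (descending) by default, so index[0] is a most
# frequent value; with sort=False the first entry is not guaranteed to be one.
df.fillna(df.Col_A.value_counts().index[0])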
Col_A
0 a
1 c
2 c
3 c
4 c
5 d
6 d
Or, with Vaishali's suggestion, use idxmax to pick c:
df.fillna(df.Col_A.value_counts(sort=False).idxmax())
Col_A
0 a
1 c
2 c
3 c
4 c
5 d
6 d
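Since c and d are tied at two occurrences each, the fill value could be either c or d, depending on the order in which value_counts returns them, which is affected by whether you pass sort=False.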
Details
df.Col_A.value_counts(sort=False)
c 2
a 1
d 2
Name: Col_A, dtype: int64
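Putting the pieces together, here is a minimal sketch that rebuilds the sample frame and fills the gaps with a most frequent value. value_counts() excludes NaN/None by default. Because c and d are tied, the tie-break used here (the tied value that appears first in the column) is an assumption of this sketch, not something the approach above prescribes.

```python
import pandas as pd

# Sample frame from the question; None marks the missing entries.
df = pd.DataFrame({"Col_A": ["a", None, None, "c", "c", "d", "d"]})

# Count non-missing values (dropna=True is the default).
counts = df["Col_A"].value_counts()

# All values tied for the highest count.
top = counts[counts == counts.max()].index

# Assumed tie-break: pick the tied value that appears first in the column.
first_seen = df["Col_A"].dropna().drop_duplicates()
fill_value = first_seen[first_seen.isin(top)].iloc[0]

# Fill only Col_A, leaving any other columns untouched.
filled = df.fillna({"Col_A": fill_value})
print(fill_value)
print(filled["Col_A"].tolist())
```

For the sample data this picks c, and the result does not depend on the order in which value_counts happens to list tied values.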