I have a DataFrame like this:
Input DataFrame:
  class section  sub  marks school   city
0     I       A  Eng     80  jghss  salem
1     I       A  Mat     90  jghss  salem
2     I       A  Eng     50    NaN  salem
3   III       A  Eng     80  gphss    NaN
4   III       A  Mat     45    NaN  salem
5   III       A  Eng     40  gphss    NaN
6   III       A  Eng     20  gphss  salem
7   III       A  Mat     55  gphss    NaN
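For anyone who wants to reproduce the input, here is a minimal sketch. It assumes the "Nan" cells are real missing values (np.nan) rather than the literal string "Nan".

```python
import numpy as np
import pandas as pd

# Rebuild the sample input; np.nan stands in for the "Nan" cells (an assumption)
df = pd.DataFrame({
    'class':   ['I', 'I', 'I', 'III', 'III', 'III', 'III', 'III'],
    'section': ['A'] * 8,
    'sub':     ['Eng', 'Mat', 'Eng', 'Eng', 'Mat', 'Eng', 'Eng', 'Mat'],
    'marks':   [80, 90, 50, 80, 45, 40, 20, 55],
    'school':  ['jghss', 'jghss', np.nan, 'gphss', np.nan, 'gphss', 'gphss', 'gphss'],
    'city':    ['salem', 'salem', 'salem', np.nan, 'salem', np.nan, 'salem', np.nan],
})
print(df)
```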
I need to replace the NaN values in "school" and "city" with the value from another row that has the same "class" and "section". The expected result is:
Output DataFrame:
  class section  sub  marks school   city
0     I       A  Eng     80  jghss  salem
1     I       A  Mat     90  jghss  salem
2     I       A  Eng     50  jghss  salem
3   III       A  Eng     80  gphss  salem
4   III       A  Mat     45  gphss  salem
5   III       A  Eng     40  gphss  salem
6   III       A  Eng     20  gphss  salem
7   III       A  Mat     55  gphss  salem
Can anyone help me out with this?
You can replace all values or selected values in a column of a pandas DataFrame based on a condition by using DataFrame.loc[], np.where(), or DataFrame.mask().
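As a rough illustration of those three options, the sketch below raises every mark under 60 to 60. The threshold and the small frame are made up for the example, and the third variant uses Series.mask on a single column rather than DataFrame.mask on the whole frame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'sub': ['Eng', 'Mat', 'Eng'], 'marks': [80, 90, 50]})

# 1) DataFrame.loc: assign in place on the rows where the condition holds
by_loc = df.copy()
by_loc.loc[by_loc['marks'] < 60, 'marks'] = 60

# 2) np.where: take 60 where the condition holds, else keep the original value
by_where = df.copy()
by_where['marks'] = np.where(by_where['marks'] < 60, 60, by_where['marks'])

# 3) Series.mask: replace the values where the condition is True
by_mask = df.copy()
by_mask['marks'] = by_mask['marks'].mask(by_mask['marks'] < 60, 60)
```

These replace values by condition only; on their own they do not look at other rows in the same group.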
You can use the fillna() function to replace NaN values in a pandas DataFrame.
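A minimal sketch of plain fillna(), with a made-up two-row frame and a made-up placeholder value 'unknown'. Note that a constant fill like this is not group-aware, so by itself it does not solve the question above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'school': ['jghss', np.nan], 'city': [np.nan, 'salem']})

# A scalar fills every missing cell with the same value
filled_scalar = df.fillna('unknown')

# A dict fills each column with its own value
filled_per_col = df.fillna({'school': 'jghss', 'city': 'salem'})
```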
Forward-fill and back-fill the missing values within each group, using a lambda function on the columns listed in cols after DataFrame.groupby. This only makes sense if every class/section combination has the same school and city values within its group:
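cols = ['school', 'city']
# transform keeps the original row index, so the result aligns with df on assignment
# (apply can prepend the group keys to the index in newer pandas versions)
df[cols] = df.groupby(['class', 'section'])[cols].transform(lambda x: x.ffill().bfill())
print(df)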
  class section  sub  marks school   city
0     I       A  Eng     80  jghss  salem
1     I       A  Mat     90  jghss  salem
2     I       A  Eng     50  jghss  salem
3   III       A  Eng     80  gphss  salem
4   III       A  Mat     45  gphss  salem
5   III       A  Eng     40  gphss  salem
6   III       A  Eng     20  gphss  salem
7   III       A  Mat     55  gphss  salem
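A possible variant of the same idea, offered as a sketch rather than as part of the answer above: take the first non-missing value per group and use it only to fill the gaps. The name group_first is illustrative, and the frame is trimmed to the columns that matter.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'class':   ['I', 'I', 'I', 'III', 'III', 'III', 'III', 'III'],
    'section': ['A'] * 8,
    'school':  ['jghss', 'jghss', np.nan, 'gphss', np.nan, 'gphss', 'gphss', 'gphss'],
    'city':    ['salem', 'salem', 'salem', np.nan, 'salem', np.nan, 'salem', np.nan],
})

cols = ['school', 'city']

# GroupBy 'first' gives the first non-null value per column in each group;
# transform broadcasts it back onto the original row index
group_first = df.groupby(['class', 'section'])[cols].transform('first')

# Fill only the missing cells; existing values are left untouched
df[cols] = df[cols].fillna(group_first)
print(df)
```

Like the ffill/bfill version, this assumes a group never holds two different schools or cities; if it does, the first one found is used for the gaps.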
Assuming that each pair of class and section corresponds to a unique pair of school and city, we can use groupby:
# create a dictionary of class and section with school and city
# here we assume that for each class/section pair there's a row with both school and city
# if that's not the case, we can separate the two series
school_city_dict = df[['class', 'section','school','city']].dropna().\
groupby(['class', 'section'])[['school','city']].\
max().to_dict()
# school_city_dict = {'school': {('I', 'A'): 'jghss', ('III', 'A'): 'gphss'},
# 'city': {('I', 'A'): 'salem', ('III', 'A'): 'salem'}}
# set index, prepare for map function
df.set_index(['class','section'], inplace=True)
df.loc[:,'school'] = df.index.map(school_city_dict['school'])
df.loc[:,'city'] = df.index.map(school_city_dict['city'])
# reset index to the original
# reset_index returns a new DataFrame, so assign the result back
df = df.reset_index()
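The comment above mentions separating the two series when no single row has both school and city. One way that could look, as a sketch under that assumption: build one lookup per column and fill only the missing cells. The names keys, lookup and looked_up are illustrative, and the four-row frame is made up so that no row is complete.

```python
import numpy as np
import pandas as pd

# No row has both school and city, so a joint dropna() would leave nothing
df = pd.DataFrame({
    'class':   ['I', 'I', 'III', 'III'],
    'section': ['A', 'A', 'A', 'A'],
    'school':  ['jghss', np.nan, np.nan, 'gphss'],
    'city':    [np.nan, 'salem', 'salem', np.nan],
})

keys = ['class', 'section']
for col in ['school', 'city']:
    # One dictionary per column, built only from rows where that column is present
    lookup = df.dropna(subset=[col]).groupby(keys)[col].first().to_dict()
    # Look up each row's (class, section) pair; unknown pairs give None
    pairs = zip(df['class'], df['section'])
    looked_up = pd.Series([lookup.get(pair) for pair in pairs], index=df.index)
    # Fill only the gaps, keeping values that are already present
    df[col] = df[col].fillna(looked_up)

print(df)
```

Filling with fillna instead of overwriting the column keeps any existing value that happens to differ from the group's lookup value.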