I have a DF like so:
Name Food Year_eaten Month_eaten
Maria Rice 2014 3
Maria Rice 2015 NaN
Maria Rice 2016 NaN
Jack Steak 2011 NaN
Jack Steak 2012 5
Jack Steak 2013 NaN
I want the output to look like this:
Name Food Year_eaten Month_eaten
Maria Rice 2014 3
Maria Rice 2015 3
Maria Rice 2016 3
Jack Steak 2011 5
Jack Steak 2012 5
Jack Steak 2013 5
I want to fill in the NaN's according to this condition:
If the row's Name, Food is the same and the Year's are consecutive:
Fill the NaN's with the Month_eaten corresponding to the row that isn't a NaN
There will be a person that has all NaN's for the Month_eaten, but I don't need to worry about that for now. Only the one's with at least one value for the Month_eaten in any of the years.
Any thoughts would be appreciated!
fillna() method is used to fill NaN/NA values on a specified column or on an entire DataaFrame with any given value. You can specify modify using inplace, or limit how many filling to perform or choose an axis whether to fill on rows/column etc.
You can group on "Name", "Food", and a custom column created by diff
ing the rows of "Year_eaten".
u = df.Year_eaten.diff().bfill().ne(1).cumsum()
v = df.groupby(['Name','Food', v]).Month_eaten.transform('first')
df['Month_eaten'] = df.Month_eaten.fillna(v, downcast='infer')
df
Name Food Year_eaten Month_eaten
0 Maria Rice 2014 3
1 Maria Rice 2015 3
2 Maria Rice 2016 3
3 Jack Steak 2011 5
4 Jack Steak 2012 5
5 Jack Steak 2013 5
Another solution, if no group has all rows with NaN, is using groupby
and ffill
(everything else is the same).
df['Month_eaten'] = df.groupby(['Name','Food', u]).Month_eaten.ffill().bfill()
df
Name Food Year_eaten Month_eaten
0 Maria Rice 2014 3
1 Maria Rice 2015 3
2 Maria Rice 2016 3
3 Jack Steak 2011 5
4 Jack Steak 2012 5
5 Jack Steak 2013 5
Using diff().ne(1).cumsum()
create the continue year group key
continueyear=df.groupby(['Name','Food']).Year_eaten.apply(lambda x : x.diff().ne(1).cumsum())
Then using groupby
with apply
ffill
and bfill
df.groupby([df.Name,df.Food,continueyear]).Month_eaten.apply(lambda x : x.ffill().bfill().astype(int))
Out[26]:
0 3
1 3
2 3
3 5
4 5
5 5
Name: Month_eaten, dtype: int32
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With