I have the following dataframe:
date country
6/1/18 USA
6/1/18 BEL
6/4/18 USA
6/5/18 BEL
6/6/18 USA
I want to create a column that tells you what the next date for the corresponding country is. You can assume the dates are ordered if it helps. If the date is the last date for the country, you can fill in next date with the same date or a null value.
date country next_date
6/1/18 USA 6/4/18
6/1/18 BEL 6/5/18
6/4/18 USA 6/6/18
6/5/18 BEL 6/5/18
6/6/18 USA 6/6/18
You can use groupby and shift to pull in the next available date for each country. Rows with no later date for their country are left as NaN:
df.assign(ndate=df.groupby('country').date.shift(-1))
date country ndate
0 6/1/18 USA 6/4/18
1 6/1/18 BEL 6/5/18
2 6/4/18 USA 6/6/18
3 6/5/18 BEL NaN
4 6/6/18 USA NaN
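For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the table in the question, with the dates kept as plain strings.

```python
import pandas as pd

# Sample data rebuilt from the question (dates kept as strings).
df = pd.DataFrame({
    "date": ["6/1/18", "6/1/18", "6/4/18", "6/5/18", "6/6/18"],
    "country": ["USA", "BEL", "USA", "BEL", "USA"],
})

# Within each country, move the following row's date up by one position.
# The last row of each group has nothing after it, so it becomes NaN.
result = df.assign(ndate=df.groupby("country")["date"].shift(-1))
print(result)
```

Note that shift works on row position within each group, so this relies on the rows already being in date order, as the question allows.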
If you would rather fill those rows with the row's own date, pass your date column to fillna:
df.assign(ndate=df.groupby('country').date.shift(-1)).fillna({'ndate': df.date})
date country ndate
0 6/1/18 USA 6/4/18
1 6/1/18 BEL 6/5/18
2 6/4/18 USA 6/6/18
3 6/5/18 BEL 6/5/18
4 6/6/18 USA 6/6/18
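If the rows might not be in date order, the same idea can be made order-independent by parsing the dates and sorting before the shift. The sketch below is only an illustration: the `%m/%d/%y` format is an assumption about how the strings should be read, and the variable names (`parsed`, `ordered`, `out`) are made up for this example.

```python
import pandas as pd

# Same data as the question, but deliberately shuffled.
df = pd.DataFrame({
    "date": ["6/5/18", "6/1/18", "6/6/18", "6/1/18", "6/4/18"],
    "country": ["BEL", "USA", "USA", "BEL", "USA"],
})

# Assumption: dates are month/day/two-digit-year strings.
parsed = df.assign(date=pd.to_datetime(df["date"], format="%m/%d/%y"))

# Sort chronologically so that "next row in the group" means "next date".
ordered = parsed.sort_values("date", kind="stable")
next_date = ordered.groupby("country")["date"].shift(-1)

# Fill each country's last row with its own date. Both fillna and assign
# align on the original index, so the original row order is preserved.
out = parsed.assign(next_date=next_date.fillna(ordered["date"]))
print(out)
```

Leaving out the fillna call keeps the missing values instead, which show up as NaT for datetime columns.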
You can also use transform. I have opted to leave the final dates as NaN values.
df.assign(next_date=df.groupby('country')['date'].transform(lambda group: group.shift(-1)))
date country next_date
0 6/1/18 USA 6/4/18
1 6/1/18 BEL 6/5/18
2 6/4/18 USA 6/6/18
3 6/5/18 BEL NaN
4 6/6/18 USA NaN
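As a quick sanity check, the transform version and a direct groupby shift should agree on this data. The snippet below is a small sketch that compares the two; the sample frame is again reconstructed from the question, and the names `via_shift`, `via_transform` and `same` are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["6/1/18", "6/1/18", "6/4/18", "6/5/18", "6/6/18"],
    "country": ["USA", "BEL", "USA", "BEL", "USA"],
})

grouped = df.groupby("country")["date"]

# Direct shift on the grouped column.
via_shift = grouped.shift(-1)

# Same operation expressed through transform with a lambda.
via_transform = grouped.transform(lambda group: group.shift(-1))

# Compare element-wise, treating missing values as equal by
# replacing them with an empty string first.
same = via_shift.fillna("").tolist() == via_transform.fillna("").tolist()
print(same)
```

The direct shift avoids calling a Python lambda once per group, so it is usually the simpler choice when nothing else needs to happen inside the function.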