
pandas ffill based on condition in another column

I have a pandas DataFrame as shown below.

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'date': ['2011-01-01', '2011-01-01', '2011-02-01', '2011-02-01', '2011-03-01', '2011-03-01', '2011-04-01', '2011-04-01'],
    'category': [1, 2, 1, 2, 1, 2, 1, 2],
    'rate': [0.5, 0.75, np.nan, np.nan, 1, 1.25, np.nan, np.nan]
})

I want to use ffill to forward fill the values of rate, but separately for each category, so that each missing rate takes the most recent value from the same category. How can I get df to look like this?

df
    category    date    rate
    1     2011-01-01    0.50
    2     2011-01-01    0.75
    1     2011-02-01    0.50
    2     2011-02-01    0.75
    1     2011-03-01    1.00
    2     2011-03-01    1.25
    1     2011-04-01    1.00
    2     2011-04-01    1.25
asked Feb 15 '18 at 21:02 by Gaurav Bansal


1 Answer

Use groupby, so that ffill is applied within each category:

df.groupby('category').ffill()

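Output (as produced when this answer was written; newer pandas releases may leave the grouping column category out of the result):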

   category        date  rate
0         1  2011-01-01  0.50
1         2  2011-01-01  0.75
2         1  2011-02-01  0.50
3         2  2011-02-01  0.75
4         1  2011-03-01  1.00
5         2  2011-03-01  1.25
6         1  2011-04-01  1.00
7         2  2011-04-01  1.25
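
For reference, a minimal self-contained sketch of the frame-wide fill, using the data from the question. The check for a missing category column is an assumption about version differences (some pandas releases omit the grouping key from the result of a grouped ffill), not something the original answer mentions; the name filled is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'date': ['2011-01-01', '2011-01-01', '2011-02-01', '2011-02-01',
             '2011-03-01', '2011-03-01', '2011-04-01', '2011-04-01'],
    'category': [1, 2, 1, 2, 1, 2, 1, 2],
    'rate': [0.5, 0.75, np.nan, np.nan, 1, 1.25, np.nan, np.nan],
})

# Forward fill every non-key column, restarting the fill inside each category.
# The result keeps the original row order and index.
filled = df.groupby('category').ffill()

# Assumption: depending on the pandas version, the grouping column may be
# missing from the result, so put it back if needed.
if 'category' not in filled.columns:
    filled.insert(0, 'category', df['category'])

print(filled)
```

Because the fill follows row order within each group, the rows should already be sorted by date, as they are here.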

If you have other columns with NaN that you don't want to fill, you can forward fill only the rate column:

df['rate'] = df.groupby('category')['rate'].ffill()
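
A small sketch of that column-only variant, on a shortened version of the data. The note column is hypothetical and is added here only to show that its NaN values are left alone.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'category': [1, 2, 1, 2],
    'rate': [0.5, 0.75, np.nan, np.nan],
    'note': ['a', np.nan, np.nan, 'd'],  # hypothetical column that should keep its NaN
})

# Select the single column after grouping, so only rate is forward filled
# within each category; assigning back aligns on the original index.
df['rate'] = df.groupby('category')['rate'].ffill()

print(df)
```

Selecting the column before calling ffill also means the result is a Series, so it can be assigned straight back without touching category or any other column.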
answered Oct 11 '22 at 11:10 by Scott Boston