How to use groupby.first() with transform function

Question

I would like to use the groupby.first() function to find the first non-null value of a group and transform that value to each row in the group.

I have tried the following code:

import pandas as pd
import numpy as np
raw_data = {'col1': ['a','a','a','b','b','b','b','b','b','c','c','c','c','c'],
            'col2': [np.nan,np.nan,6,0,2,0,8,2,2,3,0,0,4,5]}
df=pd.DataFrame(raw_data)
df['col3'] = df.groupby('col1')['col2'].transform(lambda x: x.first())
df

I would like to get a df that looks like this:

  col1  col2  col3
     a   NaN     6
     a   NaN     6
     a     6     6
     b     0     0
     b     2     0
     b     0     0
     b     8     0
     b     2     0
     b     2     0
     c     3     3
     c     0     3
     c     0     3
     c     4     3
     c     5     3

I get the following error: TypeError: first() missing 1 required positional argument: 'offset'

Interestingly, if I run the same code and just swap out first() for sum() then it returns the sum of each group for every row of that group. The first() function will not work. Why not? Any help would be greatly appreciated!

ALollz · Accepted Answer

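Inside your lambda, x is a Series, so x.first() calls Series.first. That method requires an offset argument (hence the TypeError) and only makes sense for a Series with a DatetimeIndex. Swapping in sum() works because Series.sum takes no required arguments.

You want GroupBy.first, which returns the first non-null value in each group. You can pass it to transform by its string alias 'first'.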

df['col3'] = df.groupby('col1')['col2'].transform('first')
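For reference, here is a self-contained sketch of the fix run against the data from the question. The second assignment is only an illustration of a roughly equivalent lambda, not part of the original answer. The name col3_lambda is made up for this example, and the lambda assumes every group has at least one non-null value (it would raise on an all-NaN group).

```python
import numpy as np
import pandas as pd

raw_data = {'col1': ['a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'c', 'c'],
            'col2': [np.nan, np.nan, 6, 0, 2, 0, 8, 2, 2, 3, 0, 0, 4, 5]}
df = pd.DataFrame(raw_data)

# GroupBy.first via its string alias: first non-null value per group,
# broadcast back to every row of that group.
df['col3'] = df.groupby('col1')['col2'].transform('first')

# Illustrative lambda with the same result on this data (hypothetical name).
# Assumes each group contains at least one non-null value.
col3_lambda = df.groupby('col1')['col2'].transform(lambda s: s.dropna().iloc[0])

print(df)
```

The string alias is usually the better choice, since it dispatches to the built-in groupby implementation instead of calling a Python function once per group.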

Tags:

python

pandas

Will Bachrach
