Consider the following dataframe:
index  count  signal
    1      1       1
    2      1     NaN
    3      1     NaN
    4      1      -1
    5      1     NaN
    6      2     NaN
    7      2      -1
    8      2     NaN
    9      3     NaN
   10      3     NaN
   11      3     NaN
   12      4       1
   13      4     NaN
   14      4     NaN
I need to 'ffill' the NaNs in 'signal', and rows with different 'count' values should not affect each other, so that I get the following dataframe:
index  count  signal
    1      1       1
    2      1       1
    3      1       1
    4      1      -1
    5      1      -1
    6      2     NaN
    7      2      -1
    8      2      -1
    9      3     NaN
   10      3     NaN
   11      3     NaN
   12      4       1
   13      4       1
   14      4       1
Right now I iterate through each group in a groupby object, fill the NaN values, and then copy into a new data frame:
new_table = np.array([])
for key, group in df.groupby('count'):
    group['signal'] = group['signal'].fillna(method='ffill')
    group1 = group.copy()
    if new_table.shape[0] == 0:
        new_table = group1
    else:
        new_table = pd.concat([new_table, group1])
which kind of works, but is really slow given that the data frame is large. I am wondering if there is another way to do it, with or without groupby. Thanks!
EDITED:
Thanks to Alexander and jwilner for providing alternative methods. However, both methods are very slow for my big data frame, which has 800,000 rows of data.
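For what it's worth, pandas can also do the group-wise forward fill in a single vectorized call via `GroupBy.ffill`, which avoids the per-group Python loop entirely. A minimal sketch on the example data from the question:

```python
import pandas as pd

# The example frame from the question (missing values given as None -> NaN).
df = pd.DataFrame({
    "count":  [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "signal": [1, None, None, -1, None,
               None, -1, None,
               None, None, None,
               1, None, None],
})

# Forward-fill 'signal' within each 'count' group in one vectorized call;
# values never leak across group boundaries.
df["signal"] = df.groupby("count")["signal"].ffill()
```

On large frames this is typically far faster than looping over the groups, since the fill happens in a single pass inside pandas.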
Use the apply method.
In [56]: df = pd.DataFrame({"count": [1] * 4 + [2] * 5 + [3] * 2 , "signal": [1] + [None] * 4 + [-1] + [None] * 5})
In [57]: df
Out[57]:
    count  signal
0       1       1
1       1     NaN
2       1     NaN
3       1     NaN
4       2     NaN
5       2      -1
6       2     NaN
7       2     NaN
8       2     NaN
9       3     NaN
10      3     NaN
[11 rows x 2 columns]
In [58]: def ffill_signal(df):
   ....:     df["signal"] = df["signal"].ffill()
   ....:     return df
   ....:
In [59]: df.groupby("count").apply(ffill_signal)
Out[59]:
    count  signal
0       1       1
1       1       1
2       1       1
3       1       1
4       2     NaN
5       2      -1
6       2      -1
7       2      -1
8       2      -1
9       3     NaN
10      3     NaN
[11 rows x 2 columns]
However, be aware that groupby reorders things. If the count column doesn't always stay the same or increase, but can instead have values repeated in it, groupby might be problematic. That is, given a count series like [1, 1, 2, 2, 1], groupby will group like so: [1, 1, 1], [2, 2], which could have undesirable effects on your forward filling. If that were undesired, you'd have to create a new series to use with groupby that always stays the same or increases according to changes in the count series -- probably using pd.Series.diff and pd.Series.cumsum.
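The diff/cumsum idea can be sketched like this (run_id is a made-up name for the derived key): build a key that increments every time count changes, so repeated count values form separate runs:

```python
import pandas as pd

df = pd.DataFrame({"count":  [1, 1, 2, 2, 1],
                   "signal": [1, None, -1, None, None]})

# A run id that increments whenever 'count' changes value, so the
# repeated 1 at the end becomes its own group: [1, 1, 2, 2, 3].
run_id = df["count"].diff().ne(0).cumsum()

# Forward-fill within each run instead of within each raw 'count' value.
df["signal"] = df.groupby(run_id)["signal"].ffill()
```

Grouping by run_id leaves the trailing NaN in place, whereas grouping by count directly would fill it with the value carried over from the first run of 1s.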
I know it's very late, but I found a solution that is much faster than those proposed, namely to collect the updated dataframes in a list and do the concatenation only at the end. To take your example:
new_table = []
for key, group in df.groupby('count'):
    group = group.copy()  # work on a copy to avoid SettingWithCopyWarning
    group['signal'] = group['signal'].fillna(method='ffill')
    new_table.append(group)
new_table = pd.concat(new_table).reset_index(drop=True)