Python pandas: Add column to grouped DataFrame with method chaining

First, let me say that I'm new to pandas.

I am trying to add a new column to a DataFrame. I can do this as shown in my example below, but I want to do it by chaining methods so that I don't have to assign intermediate variables. Let me first show what I want to achieve and what I have done so far:

In [1]:
import numpy as np
from pandas import Series, DataFrame
import pandas as pd

In [2]:
np.random.seed(10)
df = pd.DataFrame(np.random.randint(1, 5, size=(10, 3)), columns=list('ABC'))
df

Out [2]:
   A  B  C
0  2  2  1
1  4  1  2
2  4  1  2
3  2  1  2
4  2  3  1
5  2  1  3
6  1  3  1
7  4  1  1
8  4  4  3
9  1  4  3

In [3]:
filtered_DF = df[df['B']<2].copy()
grouped_DF = filtered_DF.groupby('A')
filtered_DF['C_Share_By_Group'] = filtered_DF.C.div(grouped_DF.C.transform("sum"))
filtered_DF

Out [3]:
   A  B  C  C_Share_By_Group
1  4  1  2               0.4
2  4  1  2               0.4
3  2  1  2               0.4
5  2  1  3               0.6
7  4  1  1               0.2
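
To see why the division in In [3] lines up row by row, it may help to compare transform("sum") with a plain aggregation. The sketch below rebuilds the filtered rows by hand (values copied from Out [3], not regenerated from the random seed); the names sums_per_group and sums_per_row are made up for illustration.

```python
import pandas as pd

# Filtered rows copied from the output above (original index labels kept).
filtered_DF = pd.DataFrame(
    {"A": [4, 4, 2, 2, 4], "B": [1, 1, 1, 1, 1], "C": [2, 2, 2, 3, 1]},
    index=[1, 2, 3, 5, 7],
)

# A plain aggregation returns one value per group, indexed by the group key A.
sums_per_group = filtered_DF.groupby("A")["C"].sum()

# transform broadcasts each group's sum back onto every row of that group,
# so the result carries the same index as filtered_DF.
sums_per_row = filtered_DF.groupby("A")["C"].transform("sum")

print(sums_per_group)
print(sums_per_row)
```

Because sums_per_row shares its index with filtered_DF, dividing C by it is an ordinary element-wise operation, with no merge back onto the original rows required.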

I want to achieve the same thing by chaining methods. In R, with the dplyr package, I would be able to do something like:

df %>% 
  filter(B<2) %>%
  group_by(A) %>% 
  mutate('C_Share_By_Group'=C/sum(C))

The pandas documentation says that mutate in R (dplyr) is equivalent to assign in pandas, but assign doesn't work on a grouped object. When I try to call assign on a grouped DataFrame, I get an error:

"AttributeError: Cannot access callable attribute 'assign' of 'DataFrameGroupBy' objects, try using the 'apply' method"

I have tried the following, but don't know how to add the new column, or if it is even possible to achieve this by chaining methods:

(df.loc[df.B<2]
   .groupby('A')
    #****WHAT GOES HERE?**** apply(something)?
)

asked May 10 '16 14:05

LauH

1 Answer

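You can use assign with a lambda. The lambda receives the filtered DataFrame, so the groupby inside it sees only the rows where B < 2: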

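# print(...) with parentheses works on both Python 2 and Python 3.
# The lambda argument is named x so it does not shadow the outer df.
print(df[df['B'] < 2].assign(C_Share_By_Group=lambda x:
                       x.C
                        .div(x.groupby('A')
                              .C
                              .transform("sum"))))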

   A  B  C  C_Share_By_Group
1  4  1  2               0.4
2  4  1  2               0.4
3  2  1  2               0.4
5  2  1  3               0.6
7  4  1  1               0.2
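
If the goal is a chain that never names an intermediate DataFrame, the filter step can take a callable as well. This is only a sketch, assuming a pandas version in which .loc accepts a callable; the names d, x and result are arbitrary, and the data is typed in from the question's Out [2] rather than regenerated from the seed.

```python
import pandas as pd

# Data copied from the question's Out [2].
df = pd.DataFrame(
    {
        "A": [2, 4, 4, 2, 2, 2, 1, 4, 4, 1],
        "B": [2, 1, 1, 1, 3, 1, 3, 1, 4, 4],
        "C": [1, 2, 2, 2, 1, 3, 1, 1, 3, 3],
    }
)

result = (
    df
    .loc[lambda d: d["B"] < 2]   # roughly dplyr's filter(B < 2)
    .assign(                     # roughly group_by(A) followed by mutate(...)
        C_Share_By_Group=lambda x: x["C"] / x.groupby("A")["C"].transform("sum")
    )
)
print(result)
```

Both lambdas receive the frame as it exists at that point in the chain, so the group sums are computed over the filtered rows only, matching the dplyr pipeline in the question.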

answered Oct 16 '22 18:10

jezrael

Tags: python, pandas, dataframe, python-2.7
