Python Pandas groupby based on another dataframe

I have two dataframes with a common index. I would like to group df1 based on a subset of columns in df2.

I already know how to group by multiple columns of df1, e.g. df1.groupby(['col1', 'col2']), and how to group by a separate Series that shares the same index, e.g. df1.groupby(df2['col1']). Is there a direct way to do something like

>>> df1.groupby(df2[['col1', 'col2']])
# ValueError: Grouper for '<class 'pandas.core.frame.DataFrame'>' not 1-dimensional

Of course, I could do

df1.groupby([df2['col1'], df2['col2']])

but it seems there should be a more direct syntax for this. (Imagine having several grouping columns, etc.)

Stalpotaten asked May 24 '26 18:05


2 Answers

How about:

gbobj = pd.concat([df1, df2[['col1', 'col2']]], axis=1).groupby(['col1', 'col2'])

You can either combine the two dataframes first (with merge, join, or concat) and then group by column name, or get a "more direct syntax" by building the list of grouping Series with a list comprehension, e.g.:

many_grouping_columns = ['A', 'B', ...]  # columns found in df2
df1.groupby([df2[col] for col in many_grouping_columns])
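
To make the two options concrete, here is a minimal sketch on made-up data. The frames, the column names "value" and "other", and the index labels are assumptions for illustration only; the only requirement carried over from the question is that df1 and df2 share an index. It uses join for the "combine first" route, which is one of the three choices mentioned above.

```python
import pandas as pd

# Hypothetical sample data: df1 and df2 share the same index.
df1 = pd.DataFrame({"value": [1, 2, 3, 4]}, index=list("abcd"))
df2 = pd.DataFrame(
    {"col1": ["x", "x", "y", "y"], "col2": [1, 1, 1, 2], "other": [0, 0, 0, 0]},
    index=list("abcd"),
)

grouping_columns = ["col1", "col2"]

# Option 1: pass a list of Series taken from df2; df1 itself is not modified.
by_series = df1.groupby([df2[col] for col in grouping_columns])["value"].sum()

# Option 2: join the grouping columns onto df1, then group by column name.
by_join = df1.join(df2[grouping_columns]).groupby(grouping_columns)["value"].sum()

print(by_series)
print(by_join)
```

Both routes should produce the same grouped sums here. The list-of-Series form leaves df1 untouched, while the join form creates a combined frame and would need extra care if df1 already had columns named like the grouping columns.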
sply88 answered May 27 '26 09:05