How to get the first group in a groupby of multiple columns?

Tags: pandas, pandas-groupby

I've been trying to figure out how I can return just the first group, after I apply groupby.

My code looks like this:

gb = df.groupby(['col1', 'col2', 'col3', 'col4'])['col5'].sum()

What I want is for just that first group to be output. I've been trying the get_group method, but it keeps failing (maybe because I am grouping by multiple columns?).

Here is an example of my output:

col1  col2  col3   col4  'sum'
 1     34   green   10    0.0
            yellow  30    1.5 
            orange  20    1.1 
 2     89   green   10    3.0 
            yellow   5    0.0 
            orange  10    1.0

What I want to be returned is just this:

col1  col2  col3   col4  'sum'
 1     34   green   10    0.0
            yellow  30    1.5 
            orange  20    1.1

(Note: I added the 'sum' header here only to make clear what the last column is; pandas does not actually name that column.)

asked Apr 12 '18 14:04

Hana

2 Answers

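You can use get_group together with groups:

# group on the two outer keys only
g = df.groupby(['col1', 'col2'])
# take the first group key, fetch that group, then aggregate the inner keys
g.get_group(list(g.groups)[0]).groupby(['col3', 'col4'])['col5'].sum()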
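For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The DataFrame is made-up data shaped like the example in the question (the col5 values are assumptions chosen to reproduce the sums shown there), and the names first_key, first_group and result are only illustrative.

```python
import pandas as pd

# Hypothetical data modelled on the question's example output.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 2],
    "col2": [34, 34, 34, 89, 89, 89],
    "col3": ["green", "yellow", "orange"] * 2,
    "col4": [10, 30, 20, 10, 5, 10],
    "col5": [0.0, 1.5, 1.1, 3.0, 0.0, 1.0],
})

# Group on the outer keys only.
g = df.groupby(["col1", "col2"])

# g.groups maps each group key to its row labels; take the first key.
first_key = list(g.groups)[0]

# With several grouping columns, get_group expects the key as a tuple,
# which is exactly what the keys of g.groups are.
first_group = g.get_group(first_key)

# Aggregate the inner keys within that first group.
result = first_group.groupby(["col3", "col4"])["col5"].sum()
print(result)
```

Passing a tuple is the part that tends to trip people up: with a multi-column groupby, get_group needs one value per grouping column, packed into a single tuple.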

answered Oct 11 '22 22:10

BENY

for group_id, group_df in df.groupby(['col1', 'col2', 'col3', 'col4']):
    break

Iterate over your groupby object and stop after the first iteration. The variables group_id and group_df will then hold the key and the DataFrame of your first group.

It's kind of an ugly workaround, but it works.
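If the for/break reads too awkwardly, the same first iteration can be pulled with next(iter(...)). The sketch below is not from the answer above: it assumes made-up data shaped like the question's example, and it groups on col1 and col2 only, on the assumption that the "first group" the question wants is the first (col1, col2) block. It also shows a second option that slices the already aggregated Series by its first outer key.

```python
import pandas as pd

# Hypothetical data modelled on the question's example output.
df = pd.DataFrame({
    "col1": [1, 1, 1, 2, 2, 2],
    "col2": [34, 34, 34, 89, 89, 89],
    "col3": ["green", "yellow", "orange"] * 2,
    "col4": [10, 30, 20, 10, 5, 10],
    "col5": [0.0, 1.5, 1.1, 3.0, 0.0, 1.0],
})

# Option 1: take the first (key, sub-frame) pair without writing a loop.
group_id, group_df = next(iter(df.groupby(["col1", "col2"])))

# Option 2: aggregate on all four keys as in the question, then select
# the first (col1, col2) block by partial indexing on the MultiIndex.
gb = df.groupby(["col1", "col2", "col3", "col4"])["col5"].sum()
first_outer = gb.index[0][:2]      # first two levels of the first index entry
first_block = gb.loc[first_outer]  # remaining index levels: col3, col4

print(group_id)
print(first_block)
```

Note that the partial .loc lookup drops the col1 and col2 levels from the result, so only col3 and col4 remain in the index.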

answered Oct 12 '22 00:10

user2505961
