I want to use groupby.agg
where my group is the entire dataframe. Put another way, I want to use the agg
functionality without the groupby. I've looked for an example of this, but cannot find one.
Here's what I've done:
import pandas as pd
import numpy as np
np.random.seed([3,1415])
df = pd.DataFrame(np.random.rand(6, 4), columns=list('ABCD'))
df
def describe(df):
    funcs = dict(Kurt=lambda x: x.kurt(),
                 Skew='skew',
                 Mean='mean',
                 Std='std')
    # a constant key for every row puts the whole frame in one group
    one_group = [True for _ in df.index]
    # apply every statistic to every column
    funcs_for_all = {k: funcs for k in df.columns}
    return df.groupby(one_group).agg(funcs_for_all).iloc[0].unstack().T
describe(df)
How was I supposed to have done this?
Instead of using groupby and aggregation together, you can also perform the aggregation without any groupby, applying it to each column separately.
If you are asking about Spark rather than pandas: at best you can use .first and .last to get the respective values from a groupBy, but not everything is available the way it is in pandas. Since there is a basic difference between how pandas and Spark handle data, not all functionality can be used in the same way.
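To illustrate that difference, here is a minimal PySpark sketch (my own illustration, assuming a local Spark session and the pandas frame df from the question). Spark's agg without a groupBy does treat the whole DataFrame as one group, but each column/statistic pair has to be spelled out explicitly rather than fanned out over all columns as pandas does:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
sdf = spark.createDataFrame(df)  # df is the pandas frame from the question

# Aggregate the whole frame as one group; every column/statistic pair
# must be named explicitly.
sdf.agg(F.mean('A'), F.stddev('A'), F.skewness('A'), F.kurtosis('A')).show()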
agg() is used to pass a function or list of functions to be applied to a Series, or even to each element of the Series separately. Given a list of functions, agg() returns multiple results. Parameters: func: function, list of functions, or string name of a function to be called on the Series.
agg is an alias for aggregate. Use the alias. A passed user-defined function will receive a Series for evaluation, and the aggregation is performed for each column.
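As a concrete sketch of that (assuming pandas 0.20 or later, where DataFrame.agg was introduced), the statistics from the question can be computed with no groupby at all; the rename to capitalised labels is purely cosmetic:

import pandas as pd
import numpy as np

np.random.seed([3, 1415])
df = pd.DataFrame(np.random.rand(6, 4), columns=list('ABCD'))

# DataFrame.agg applies each function to every column, treating the
# whole frame as a single implicit group.
df.agg(['kurt', 'skew', 'mean', 'std']).rename(index=str.capitalize)

This produces the same stats-by-columns table that the groupby detour builds.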
A small compaction of your own proposal, which I think improves readability; it exploits the fact that DataFrame.groupby() accepts a lambda function:
def describe(df):
    funcs = dict(Kurt=lambda x: x.kurt(),
                 Skew='skew',
                 Mean='mean',
                 Std='std')
    funcs_for_all = {k: funcs for k in df.columns}
    # a lambda returning a constant maps every row to the same group
    return df.groupby(lambda _: True).agg(funcs_for_all).iloc[0].unstack().T
describe(df)
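One caveat worth noting: passing a dict of dicts to .agg (renaming while aggregating) was deprecated in pandas 0.20 and removed in pandas 1.0, so on recent versions both versions of describe above raise a SpecificationError. A minimal sketch that avoids the nested renamer, using a plain list of functions and renaming afterwards (the kurt helper and the label mapping are my own choices):

def describe(df):
    def kurt(x):
        # older pandas has no 'kurt' string aggregation on GroupBy,
        # so wrap it in a named function
        return x.kurt()
    out = df.groupby(lambda _: True).agg([kurt, 'skew', 'mean', 'std'])
    return (out.iloc[0].unstack().T
               .rename(index={'kurt': 'Kurt', 'skew': 'Skew',
                              'mean': 'Mean', 'std': 'Std'}))

describe(df)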