Pandas dataframe: Group by two columns and then average over another column

Tags: python, pandas, group-by, average

Assuming that I have a dataframe with the following values:

df:
col1    col2    value
1       2       3
1       2       1
2       3       1

I want to first group my dataframe by the first two columns (col1 and col2) and then average the values of the third column (value). The desired output would look like this:

col1    col2    avg-value
1       2       2
2       3       1

I am using the following code:

columns = ['col1','col2','avg']
df = pd.DataFrame(columns=columns)
df.loc[0] = [1,2,3]
df.loc[1] = [1,3,3]
print(df[['col1','col2','avg']].groupby('col1','col2').mean())

which gets the following error:

ValueError: No axis named col2 for object type <class 'pandas.core.frame.DataFrame'>

Any help would be much appreciated.


asked Feb 23 '16 20:02

ahajib

2 Answers

You need to pass the columns to groupby as a single list. In your call, 'col1' was taken as the grouping key and 'col2', as the second positional argument, was interpreted as the axis param, which is why it raised an error:

In [30]:
columns = ['col1','col2','avg']
df = pd.DataFrame(columns=columns)
df.loc[0] = [1,2,3]
df.loc[1] = [1,3,3]

print(df[['col1','col2','avg']].groupby(['col1','col2']).mean())
           avg
col1 col2     
1    2       3
     3       3
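The same fix can be checked against the table from the question itself. The snippet below is only a sketch: it builds the frame from a dict (my choice, not the asker's code) so the columns get integer dtypes, and it selects the `value` column before averaging.

```python
import pandas as pd

# The three rows from the table in the question.
df = pd.DataFrame({
    "col1": [1, 1, 2],
    "col2": [2, 2, 3],
    "value": [3, 1, 1],
})

# A list of column names is one `by` argument, so nothing
# spills over into the next positional parameter.
means = df.groupby(["col1", "col2"])["value"].mean()
print(means)
```

The result is a Series indexed by the (col1, col2) pairs, with one averaged value per group.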


answered Oct 23 '22 22:10

EdChum

If you want to group by multiple columns, you should put them in a list:

import pandas as pd
columns = ['col1','col2','value']
df = pd.DataFrame(columns=columns)
df.loc[0] = [1,2,3]
df.loc[1] = [1,3,3]
df.loc[2] = [2,3,1]
print(df.groupby(['col1','col2']).mean())

Or slightly more verbose, for the sake of getting the word 'avg' in your aggregated dataframe:

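import pandas as pd
columns = ['col1','col2','value']
df = pd.DataFrame(columns=columns)
df.loc[0] = [1,2,3]
df.loc[1] = [1,3,3]
df.loc[2] = [2,3,1]
# Named aggregation (pandas 0.25+): output column 'avg' = mean of 'value'.
# The older nested-dict form {'value': {'avg': np.mean}} was removed in pandas 1.0.
print(df.groupby(['col1','col2']).agg(avg=('value', 'mean')))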
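If you want exactly the flat layout from the question, with a column literally called avg-value, one possible approach is named aggregation followed by reset_index. This is a sketch that assumes pandas 0.25 or later; because avg-value is not a valid Python keyword name, it is passed through an unpacked dict. The frame is rebuilt from the question's table for illustration.

```python
import pandas as pd

# The three rows from the table in the question.
df = pd.DataFrame({
    "col1": [1, 1, 2],
    "col2": [2, 2, 3],
    "value": [3, 1, 1],
})

# "avg-value" cannot be written as a keyword argument,
# so the output name is supplied via dict unpacking.
flat = (
    df.groupby(["col1", "col2"])
      .agg(**{"avg-value": ("value", "mean")})
      .reset_index()  # turn the group keys back into ordinary columns
)
print(flat)
```

reset_index moves col1 and col2 out of the index, so the result has three ordinary columns rather than a MultiIndex.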

answered Oct 23 '22 22:10

jkokorian
