Pandas - Add Column Name to Results of groupby [duplicate]

I would like to add column names to the results of a groupby on a DataFrame in Python 3.6.

I tried this code:

import pandas as pd
d = {'timeIndex': [1, 1, 1, 1, 2, 2, 2], 'isZero': [0,0,0,1,0,0,0]}
df = pd.DataFrame(data=d)
df2 = df.groupby(['timeIndex'])['isZero'].sum()
print(df2)

Result

timeIndex
1    1
2    0
Name: isZero, dtype: int64

It looks like timeIndex is a column heading, but attempts to address a column by name produce exceptions.

df2['timeIndex']
# KeyError: 'timeIndex'

df2['isZero']
# KeyError: 'isZero'

I am looking for this result.

df2

   timeIndex  isZero
0          1       1
1          2       0

df2['isZero']

0    1
1    0
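
A quick way to see why both lookups raise KeyError is to inspect what groupby(...).sum() returns. The following is a minimal sketch that reuses the df from the question; the variable names are only illustrative. The result is a Series, so timeIndex is the name of its index and isZero is the name of the Series itself, and neither one is a column.

```python
import pandas as pd

d = {'timeIndex': [1, 1, 1, 1, 2, 2, 2], 'isZero': [0, 0, 0, 1, 0, 0, 0]}
df = pd.DataFrame(data=d)

# Selecting a single column before aggregating yields a Series, not a DataFrame.
result = df.groupby(['timeIndex'])['isZero'].sum()

result_type = type(result).__name__   # class name of the aggregated object
index_name = result.index.name        # 'timeIndex' labels the index, not a column
series_name = result.name             # 'isZero' is the name of the Series

# Indexing a Series with [] looks up index labels (here the group keys 1 and 2),
# so a string such as 'isZero' or 'timeIndex' is not found.
value_for_group_1 = result[1]         # sum of isZero where timeIndex == 1

print(result_type, index_name, series_name, value_for_group_1)
```

Both methods in the answer work by turning that index into an ordinary column of a DataFrame.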


Method 1:

Use the argument as_index=False in your groupby. Without it, the grouping key becomes the index of the result rather than a column:

df2 = df.groupby(['timeIndex'], as_index=False)['isZero'].sum()

>>> df2
   timeIndex  isZero
0          1       1
1          2       0

>>> df2['isZero']
0    1
1    0
Name: isZero, dtype: int64

Method 2:

Convert the aggregated Series with to_frame, passing your desired column name, and then call reset_index to turn the timeIndex index into a column:

df2 = df.groupby(['timeIndex'])['isZero'].sum().to_frame('isZero').reset_index()

>>> df2
   timeIndex  isZero
0          1       1
1          2       0

>>> df2['isZero']
0    1
1    0
Name: isZero, dtype: int64
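
Both methods should produce the same DataFrame. The sketch below checks that, and also shows a shorter variant that is an addition here rather than part of the original answer: because the aggregated Series is already named isZero, calling reset_index() on it directly should give the same columns, and reset_index(name=...) lets you pick a different column name. The name zeroCount is purely hypothetical.

```python
import pandas as pd

d = {'timeIndex': [1, 1, 1, 1, 2, 2, 2], 'isZero': [0, 0, 0, 1, 0, 0, 0]}
df = pd.DataFrame(data=d)

# Method 1: keep the grouping key as a regular column.
m1 = df.groupby(['timeIndex'], as_index=False)['isZero'].sum()

# Method 2: aggregate to a Series, convert it to a DataFrame,
# then move the index into a column.
m2 = df.groupby(['timeIndex'])['isZero'].sum().to_frame('isZero').reset_index()

# Shorter variant (not in the original answer):
# Series.reset_index() reuses the Series name as the value column name.
m3 = df.groupby(['timeIndex'])['isZero'].sum().reset_index()

# Same idea with a custom column name; 'zeroCount' is a hypothetical name.
m4 = df.groupby(['timeIndex'])['isZero'].sum().reset_index(name='zeroCount')

print(m1.equals(m2), m2.equals(m3))
print(list(m4.columns))
```

Method 1 is usually the most direct choice when the aggregation is written in one place; the reset_index variants are handy when you already have the Series in hand.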

Tags: python, pandas, dataframe, pandas-groupby

Jacob Quisenberry

1 Answer

sacuL
