pandas 'as_index' function doesn't work as expected

Tags:

python

pandas

This is a minimal reproducible example of my original DataFrame, called 'calls':

       phone_number    call_outcome   agent  call_number
0      83473306392   NOT INTERESTED  orange            0
1     762850680150  CALL BACK LATER  orange            1
2     476309275079   NOT INTERESTED  orange            2
3     899921761538  CALL BACK LATER     red            3
4     906739234066  CALL BACK LATER  orange            4

Writing this pandas command...

most_calls = calls.groupby('agent') \
.count().sort('call_number', ascending=False)

Returns this...

           phone_number  call_outcome  call_number
agent                                          
orange          2234          2234         2234
red             1478          1478         1478
black            750           750          750
green            339           339          339
blue             199           199          199

This is correct, except that I want 'agent' to be a regular column rather than the index.

I've used the as_index=False parameter on numerous occasions and am familiar with specifying axis=1. However, in this instance it doesn't matter where or how I incorporate these parameters: every permutation returns an error.

These are some examples I've tried and the corresponding errors:

most_calls = calls.groupby('agent', as_index=False) \
.count().sort('call_number', ascending=False)

ValueError: invalid literal for long() with base 10: 'black'

And

most_calls = calls.groupby('agent', as_index=False, axis=1) \
.count().sort('call_number', ascending=False)

ValueError: as_index=False only valid for axis=0
RDJ asked Jun 25 '15 12:06

1 Answer

I believe that, irrespective of the groupby operation you've done, you just need to call reset_index to turn the index back into a regular column.

Starting with a mockup of your data:

import pandas as pd
calls = pd.DataFrame({
    'agent': ['orange', 'red'],
    'phone_number': [2234, 1478],
    'call_outcome': [2234, 1478],
})
>> calls
    agent  call_outcome  phone_number
0  orange          2234          2234
1     red          1478          1478

Here is the operation you did with reset_index() appended:

>> calls.groupby('agent').count().sort('phone_number', ascending=False).reset_index()
    agent  call_outcome  phone_number
0  orange             1             1
1     red             1             1
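
Note that DataFrame.sort was deprecated in pandas 0.17 and removed in 0.20, so on a recent pandas the same idea would need sort_values instead. The following is only a sketch under that assumption; the sample rows are the five shown in the question, and the name most_calls is taken from the question as well:

```python
import pandas as pd

# The five sample rows from the question (not the full dataset).
calls = pd.DataFrame({
    'phone_number': [83473306392, 762850680150, 476309275079,
                     899921761538, 906739234066],
    'call_outcome': ['NOT INTERESTED', 'CALL BACK LATER', 'NOT INTERESTED',
                     'CALL BACK LATER', 'CALL BACK LATER'],
    'agent': ['orange', 'orange', 'orange', 'red', 'orange'],
    'call_number': [0, 1, 2, 3, 4],
})

# Count rows per agent, sort by the count, then move 'agent'
# from the index back into an ordinary column.
most_calls = (
    calls.groupby('agent')
    .count()
    .sort_values('call_number', ascending=False)  # replaces the removed .sort()
    .reset_index()
)

print(most_calls)
```

After reset_index(), 'agent' is the first column and the index is a plain 0..n-1 range, so it can be filtered or merged on like any other column.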
Ami Tavory answered Oct 21 '22 17:10
