missing column after pandas groupby

Tags: python, pandas, dataframe, group-by

I have a pandas dataframe df. I group it by three columns and count the results. When I do this I lose some information, specifically the name column, which maps 1:1 to the desk_id column. Is there any way to include both in my final dataframe?

Here is the dataframe:

   shift_id    shift_start_time      shift_end_time        name                   end_time       desk_id  shift_hour
0  37423064 2014-01-17 08:00:00 2014-01-17 12:00:00  Adam Scott 2014-01-17 10:16:41.040000  15557987           2
1  37423064 2014-01-17 08:00:00 2014-01-17 12:00:00  Adam Scott 2014-01-17 10:16:41.096000  15557987           2
2  37423064 2014-01-17 08:00:00 2014-01-17 12:00:00  Adam Scott 2014-01-17 10:52:17.402000  15557987           2
3  37423064 2014-01-17 08:00:00 2014-01-17 12:00:00  Adam Scott 2014-01-17 11:06:59.083000  15557987           3
4  37423064 2014-01-17 08:00:00 2014-01-17 12:00:00  Adam Scott 2014-01-17 08:27:57.998000  15557987           0

I group it like this:

grouped = df.groupby(['desk_id', 'shift_id', 'shift_hour']).size()
grouped = grouped.reset_index()

And here is the result, missing the name column.

    desk_id  shift_id  shift_hour  0
0  14468690  37729081           0  7
1  14468690  37729081           1  3
2  14468690  37729081           2  6
3  14468690  37729081           3  5
4  14468690  37729082           0  5

Also, is there any way to rename the count column to 'count' instead of 0?

asked Jun 27 '14 16:06

user3439329

1 Answer

You need to include 'name' in the list of groupby keys:

import numpy as np

grouped = df.groupby(['desk_id', 'shift_id', 'shift_hour', 'name']).size()
grouped = grouped.reset_index()
# replace the default column label 0 with 'count'
grouped.columns = np.where(grouped.columns == 0, 'count', grouped.columns)
print(grouped)
    desk_id  shift_id  shift_hour        name  count
0  15557987  37423064           0  Adam Scott      1
1  15557987  37423064           2  Adam Scott      3
2  15557987  37423064           3  Adam Scott      1
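
As a shorter alternative sketch: Series.reset_index accepts a name argument, so the count column can be named in one step without numpy. The small df below is a made-up stand-in for the question's data (only the relevant columns), not the real frame.

```python
import pandas as pd

# Hypothetical stand-in for the question's dataframe (relevant columns only).
df = pd.DataFrame({
    'desk_id': [15557987] * 5,
    'shift_id': [37423064] * 5,
    'name': ['Adam Scott'] * 5,
    'shift_hour': [2, 2, 2, 3, 0],
})

# size() returns a Series indexed by the group keys;
# reset_index(name=...) turns the keys into columns and names the values column.
grouped = (
    df.groupby(['desk_id', 'shift_id', 'shift_hour', 'name'])
      .size()
      .reset_index(name='count')
)
print(grouped)
```

This should give the same three rows as above, with the last column already labelled 'count'.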

If the name-to-desk_id relationship is many-to-one rather than 1:1, say there is also a Pete Scott with the same desk_id and the same set of rows, the result becomes:

    desk_id  shift_id  shift_hour        name  count
0  15557987  37423064           0  Adam Scott      1
1  15557987  37423064           0  Pete Scott      1
2  15557987  37423064           2  Adam Scott      3
3  15557987  37423064           2  Pete Scott      3
4  15557987  37423064           3  Adam Scott      1
5  15557987  37423064           3  Pete Scott      1
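
Because the shape of the result depends on whether name really maps 1:1 to desk_id, it may be worth checking that before grouping. The sketch below is one possible approach, not part of the original answer: it counts distinct names per desk_id, then groups on the original three keys and merges the name back from a lookup table for the desks that pass the check. The data, including the extra desk and the name 'Ann Lee', is invented for illustration.

```python
import pandas as pd

# Hypothetical data: one desk_id has two names, another has exactly one.
df = pd.DataFrame({
    'desk_id': [15557987, 15557987, 15557987, 14468690],
    'shift_id': [37423064, 37423064, 37423064, 37729081],
    'name': ['Adam Scott', 'Adam Scott', 'Pete Scott', 'Ann Lee'],
    'shift_hour': [0, 2, 2, 1],
})

# Number of distinct names per desk; anything above 1 breaks the 1:1 assumption.
names_per_desk = df.groupby('desk_id')['name'].nunique()
ambiguous = names_per_desk[names_per_desk > 1]
print(ambiguous)

# Keep only desks with a single name, count rows per group on the three keys,
# then attach the name from a deduplicated desk_id -> name lookup.
clean = df[~df['desk_id'].isin(ambiguous.index)]
counts = (
    clean.groupby(['desk_id', 'shift_id', 'shift_hour'])
         .size()
         .reset_index(name='count')
)
lookup = clean[['desk_id', 'name']].drop_duplicates()
result = counts.merge(lookup, on='desk_id', how='left')
print(result)
```

When the mapping is truly 1:1, this merge and the four-key groupby above should produce the same rows; the merge variant just keeps name out of the grouping keys.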

answered Oct 11 '22 17:10

CT Zhu
