
pandas grouper issue with key that is an index

I have a pandas dataframe with the following form:

                Response
Time    
2018-01-14 00:00:00 201
2018-01-14 00:00:00 400
2018-01-14 00:00:00 200
2018-01-14 00:00:00 400
2018-01-14 00:00:00 200

Time is the index column.

I wanted to get graphs for the responses grouped over time (15 min intervals) so I wrote the following:

for ind, itm in enumerate(df_final['Response'].unique()):
    ax=df_final[df_final['Response'] == itm].groupby(pd.Grouper(key='Time',freq='15Min')).count().plot(kind='bar', figsize=(15,10), title="Response Codes")
    ax.legend(["Response: {}".format(itm)])

This worked with the deprecated TimeGrouper, where the second line in the above code was:

ax=df_final[df_final['Response'] == itm].groupby(pd.TimeGrouper(freq='15Min')).count().plot(kind='bar', figsize=(15,10), title="Response Codes")

but when I run the Grouper code I get the error:

KeyError: 'The grouper name Time is not found'

I also changed the key to df_final.index.name, but that resulted in the same KeyError: 'The grouper name Time is not found'

The index was of type Index, but I changed it to DatetimeIndex:

type(df_final.index)

pandas.core.indexes.datetimes.DatetimeIndex

After I changed the index type and ran:

ax=df_final[df_final['Response'] == itm].groupby(pd.Grouper(key=df_final.index, freq='15Min')).count().plot(kind='bar', figsize=(15,10), title="Response Codes")

I got:

TypeError: unhashable type: 'DatetimeIndex'

I'm obviously missing something. What am I doing wrong here?

Just to show what the index is, df_final.index gave the result:

DatetimeIndex(['2018-01-14 00:00:00', '2018-01-14 00:00:00',
           '2018-01-14 00:00:00', '2018-01-14 00:00:00',
           '2018-01-14 00:00:00', '2018-01-14 00:00:00',
           '2018-01-14 00:00:00', '2018-01-14 00:00:00',
           '2018-01-14 00:00:00', '2018-01-14 00:00:00',
           ...
           '2018-01-15 00:00:00', '2018-01-15 00:00:00',
           '2018-01-15 00:00:00', '2018-01-15 00:00:00',
           '2018-01-15 00:00:00', '2018-01-15 00:00:00',
           '2018-01-15 00:00:00', '2018-01-15 00:00:00',
           '2018-01-15 00:00:00', '2018-01-15 00:00:00'],
          dtype='datetime64[ns]', name='Time', length=48960011, freq=None)

After some investigation with the aid of jezrael, it looks like the issue is in the plot method. I broke up the code into:

for ind, itm in enumerate(df_final['Response'].unique()):
    ax=df_final[df_final['Response'] == itm].groupby(pd.Grouper(level='Time', freq='15Min')).count()
    ax.plot(kind='bar', figsize=(15,10), title="Response Codes")

and the error that occurs on the plot line is:

~/anaconda2/envs/py3env/lib/python3.6/site-packages/pandas/plotting/_core.py in __init__(self, data, kind, by, subplots, sharex, sharey, use_index, figsize, grid, legend, rot, ax, fig, title, xlim, ylim, xticks, yticks, sort_columns, fontsize, secondary_y, colormap, table, layout, **kwds)
     98                  table=False, layout=None, **kwds):
     99 
--> 100         _converter._WARN = False
    101         self.data = data
    102         self.by = by

NameError: name '_converter' is not defined

I don't know if I did something wrong or if there is an error in matplotlib, but this is where I am stuck. The result of the previous line (ax) shows counts and times as expected.

amadain asked Jan 16 '18 13:01


2 Answers

I think you need:

pd.Grouper(level='Time',freq='15Min')
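
As a minimal sketch of the difference, assuming a small made-up frame (df_demo is hypothetical and only mimics the shape of df_final): level= points the Grouper at an index level, while key= looks for a column, so key='Time' only works once Time has been moved out of the index.

```python
import pandas as pd

# Hypothetical sample data shaped like the frame in the question:
# a DatetimeIndex named "Time" and a single "Response" column.
idx = pd.DatetimeIndex(
    ["2018-01-14 00:00:00", "2018-01-14 00:05:00",
     "2018-01-14 00:20:00", "2018-01-14 00:40:00"],
    name="Time",
)
df_demo = pd.DataFrame({"Response": [200, 400, 200, 200]}, index=idx)

# level= groups on the index level called "Time"
counts = df_demo.groupby(pd.Grouper(level="Time", freq="15min")).count()

# key= groups on a column, so Time has to be a column first
counts_from_column = (
    df_demo.reset_index()
    .groupby(pd.Grouper(key="Time", freq="15min"))
    .count()
)

print(counts)
```

Both calls should produce the same 15-minute bins; the only difference is where the Grouper looks for Time.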

I believe you can add column Response to groupby, reshape by unstack and plot:

a = df_final.groupby([pd.Grouper(level='Time',freq='15Min'), 'Response'])['Response'].count()
a.unstack().plot(kind='bar', figsize=(15,10), title="Response Codes")
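
To see what the reshaped table looks like before plotting, here is a small sketch on made-up data (df_demo and its values are assumptions, not the real df_final). Passing fill_value=0 to unstack is an optional addition so that intervals where a response code never occurred show 0 instead of NaN.

```python
import pandas as pd

# Hypothetical sample data: DatetimeIndex named "Time", one "Response" column.
idx = pd.DatetimeIndex(
    ["2018-01-14 00:00:00", "2018-01-14 00:05:00",
     "2018-01-14 00:10:00", "2018-01-14 00:20:00"],
    name="Time",
)
df_demo = pd.DataFrame({"Response": [200, 400, 200, 200]}, index=idx)

# Count rows per (15-minute bin, response code), then pivot the
# response codes into columns so each code becomes its own series.
table = (
    df_demo.groupby([pd.Grouper(level="Time", freq="15min"), "Response"])["Response"]
    .count()
    .unstack(fill_value=0)
)

print(table)
```

Calling table.plot(kind='bar', figsize=(15,10), title="Response Codes") on the result would then draw one bar per response code in each interval, which replaces the loop over unique response codes.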
jezrael answered Oct 11 '22 18:10


It seems that the matplotlib version was the issue. When I went back to version 2.0.2 I had no problems. Replace matplotlib version 2.1.1 with 2.0.2 using:

! pip uninstall -y matplotlib && pip install matplotlib==2.0.2

Then import matplotlib again and the code works.
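
To confirm which version is actually installed after the reinstall, something like the following sketch may help. It uses only the standard library (importlib.metadata, Python 3.8+), and installed_version is a hypothetical helper name, not part of matplotlib or pandas.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed matplotlib version, or None if it is not installed
print(installed_version("matplotlib"))
```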

amadain answered Oct 11 '22 18:10