Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Group pandas dataframe by a nested dictionary key

I have a pandas dataframe where one of the columns is dictionary type. This is an example dataframe:

import pandas as pd
df = pd.DataFrame({'a': [1,2,3], 
                   'b': [4,5,6], 
                   'version': [{'major': 7, 'minor':1}, 
                               {'major':8, 'minor': 5},
                               {'major':7, 'minor':2}] })

df:

   a  b                   version
0  1  4  {'minor': 1, 'major': 7}
1  2  5  {'minor': 5, 'major': 8}
2  3  6  {'minor': 2, 'major': 7}

I am looking for a way to group the dataframe by one of that dictionary key; in this case to group the df dataframe by the major key in version label.

I have tried a few different stuff, from passing the dictionary key to dataframe groupby function, `df.groupby(['version']['major']), which doesn't work since major is not part of dataframe label, to assigning version to the dataframe index, but nothing works so far. I'm also trying to flatten the dictionaries as additional columns in the dataframe itself, but this seems to have its own issue.

Any idea?

P.S. Sorry about formatting, it's my first stackoverflow question.

like image 533
RexIncognito Avatar asked Sep 18 '25 16:09

RexIncognito


2 Answers

Option 1

df.groupby(df.version.apply(lambda x: x['major'])).size()

version
7    2
8    1
dtype: int64

df.groupby(df.version.apply(lambda x: x['major']))[['a', 'b']].sum()

enter image description here

Option 2

df.groupby(df.version.apply(pd.Series).major).size()

major
7    2
8    1
dtype: int64

df.groupby(df.version.apply(pd.Series).major)[['a', 'b']].sum()

enter image description here

like image 76
piRSquared Avatar answered Sep 20 '25 05:09

piRSquared


you can do it this way:

In [15]: df.assign(major=df.version.apply(pd.Series).major).groupby('major').sum()
Out[15]:
       a   b
major
7      4  10
8      2   5
like image 29
MaxU - stop WAR against UA Avatar answered Sep 20 '25 06:09

MaxU - stop WAR against UA