Pandas dataframe to dict of list of tuples

Suppose I have the following dataframe:

import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 3, 3], 'v1': ['a', 'a', 'c', 'c', 'd'], 'v2': ['z', 'y', 'w', 'y', 'z']})
df
id  v1  v2
1   a   z
2   a   y
3   c   w
3   c   y
3   d   z

And I want to transform it to this format:

{1: [('a', 'z')], 2: [('a', 'y')], 3: [('c', 'w'), ('c', 'y'), ('d', 'z')]}

I basically want to create a dict where the keys are the ids and each value is a list of the (v1, v2) tuples for that id.

I tried using groupby on id:

df.groupby('id')[['v1', 'v2']].apply(list)

But this didn't work.

Bruno Mello asked Dec 13 '22 08:12


1 Answer

Create the tuples first, then group them by id and aggregate each group into a list:

d = df[['v1', 'v2']].agg(tuple, axis=1).groupby(df['id']).apply(list).to_dict()
print(d)
{1: [('a', 'z')], 2: [('a', 'y')], 3: [('c', 'w'), ('c', 'y'), ('d', 'z')]}
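
To make the chain above easier to follow, here is a sketch that splits it into three named steps. The intermediate names (pairs, grouped) are only illustrative and do not appear in the one-liner; the data is the frame from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 3, 3],
    'v1': ['a', 'a', 'c', 'c', 'd'],
    'v2': ['z', 'y', 'w', 'y', 'z'],
})

# Step 1: turn each row of (v1, v2) into a tuple. The result is a Series
# that keeps the original row index, so it still lines up with df['id'].
pairs = df[['v1', 'v2']].agg(tuple, axis=1)

# Step 2: group the tuples by the id column and collect each group into a list.
grouped = pairs.groupby(df['id']).apply(list)

# Step 3: convert the id-indexed Series into a plain dict.
d = grouped.to_dict()
print(d)
```

Grouping by df['id'] rather than by the name 'id' is needed here because pairs is a Series and no longer carries the id column.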

Another idea is to use a MultiIndex:

d = df.set_index(['v1', 'v2']).groupby('id').apply(lambda x: x.index.tolist()).to_dict()
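
As for why the attempt in the question did not give the expected result: a likely explanation is that calling list on a DataFrame iterates over its column labels, not its rows, so each group collapses to the column names. The sketch below shows that behavior and a variant that stays close to the original attempt. The helper rows_as_tuples is a hypothetical name introduced here for illustration, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 3, 3],
    'v1': ['a', 'a', 'c', 'c', 'd'],
    'v2': ['z', 'y', 'w', 'y', 'z'],
})

# Iterating a DataFrame yields its column labels, not its rows.
print(list(df[['v1', 'v2']]))


def rows_as_tuples(group):
    """Hypothetical helper: return the rows of a group as plain tuples."""
    # index=False drops the row index; name=None gives plain tuples
    # instead of namedtuples.
    return list(group.itertuples(index=False, name=None))


# Same shape as the original attempt, but with a function that walks rows.
d = df.groupby('id')[['v1', 'v2']].apply(rows_as_tuples).to_dict()
print(d)
```

This keeps the groupby-then-apply structure from the question and only swaps list for a row-wise conversion.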
jezrael answered Dec 29 '22 12:12