
Get list of rows with same name from dataframe using pandas

I am looking for a way to group the rows that share the same Name and get each row's remaining values (x, y, r) as a list.

Name    x   y   r
  a     9   81  63
  a     98  5   89
  b     51  50  73
  b     41  22  14
  c     6   18  1
  c     1   93  55
  d     57  2   90
  d     58  24  20

I am trying to get a dictionary like this:

di = {'a': {0: [9,81,63], 1: [98,5,89]},
      'b': {0: [51,50,73], 1: [41,22,14]},
      'c': {0: [6,18,1], 1: [1,93,55]},
      'd': {0: [57,2,90], 1: [58,24,20]}}
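
To make the question reproducible, the table above can be rebuilt as a DataFrame. This is only a sketch: the variable names `df` and `expected` are assumptions chosen to match the snippets in the answers, and the group names are assumed to be strings.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'Name': ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'd'],
    'x': [9, 98, 51, 41, 6, 1, 57, 58],
    'y': [81, 5, 50, 22, 18, 93, 2, 24],
    'r': [63, 89, 73, 14, 1, 55, 90, 20],
})

# The desired result, keyed by Name, then by row position within the group.
expected = {
    'a': {0: [9, 81, 63], 1: [98, 5, 89]},
    'b': {0: [51, 50, 73], 1: [41, 22, 14]},
    'c': {0: [6, 18, 1], 1: [1, 93, 55]},
    'd': {0: [57, 2, 90], 1: [58, 24, 20]},
}
```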
arshad92 asked Dec 24 '22 12:12


2 Answers

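Use groupby with a custom function that numbers the rows of each group and pairs those numbers with the row lists, then convert the resulting Series with to_dict:

# Select the columns with a list (double brackets); the tuple-style
# ['x','y','r'] selection is rejected by newer pandas versions.
di = (df.groupby('Name')[['x','y','r']]
        .apply(lambda x: dict(zip(range(len(x)),x.values.tolist())))
        .to_dict())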

print (di)
{'b': {0: [51, 50, 73], 1: [41, 22, 14]}, 
 'a': {0: [9, 81, 63], 1: [98, 5, 89]}, 
 'c': {0: [6, 18, 1], 1: [1, 93, 55]}, 
 'd': {0: [57, 2, 90], 1: [58, 24, 20]}}

Detail:

print (df.groupby('Name')[['x','y','r']]
         .apply(lambda x: dict(zip(range(len(x)),x.values.tolist()))))
Name
a      {0: [9, 81, 63], 1: [98, 5, 89]}
b    {0: [51, 50, 73], 1: [41, 22, 14]}
c       {0: [6, 18, 1], 1: [1, 93, 55]}
d     {0: [57, 2, 90], 1: [58, 24, 20]}
dtype: object

Thanks to volcano for suggesting enumerate:

di = (df.groupby('Name')[['x','y','r']]
       .apply(lambda x: dict(enumerate(x.values.tolist())))
       .to_dict())
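
If the apply step feels opaque, the same grouping can be written as a dict comprehension that iterates over the groups directly. This is a sketch of an equivalent formulation, not part of the original answer; the sample DataFrame is rebuilt here from the question's table so the block runs on its own.

```python
import pandas as pd

# Sample data rebuilt from the question's table (assumed to match it).
df = pd.DataFrame({
    'Name': ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'd'],
    'x': [9, 98, 51, 41, 6, 1, 57, 58],
    'y': [81, 5, 50, 22, 18, 93, 2, 24],
    'r': [63, 89, 73, 14, 1, 55, 90, 20],
})

# Iterating a groupby yields (group name, sub-DataFrame) pairs.
# enumerate numbers the rows of each group starting at 0.
di = {
    name: dict(enumerate(group[['x', 'y', 'r']].values.tolist()))
    for name, group in df.groupby('Name')
}
print(di)
```

Here the outer keys come from the group names and the inner keys from enumerate, so no intermediate Series is created.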

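For easier debugging, you can use a named function that prints the intermediate values:

def f(x):
    # x is the sub-DataFrame (columns x, y, r) for one Name
    #print (x)
    a = range(len(x))        # row positions within the group
    b = x.values.tolist()    # each row as a plain list
    print (a)
    print (b)
    return dict(zip(a,b))

di = df.groupby('Name')[['x','y','r']].apply(f).to_dict()
print (di)

Output of the print calls inside f (older pandas versions evaluate the first group twice, which is why group a appears twice here):

range(0, 2)
[[9, 81, 63], [98, 5, 89]]
range(0, 2)
[[9, 81, 63], [98, 5, 89]]
range(0, 2)
[[51, 50, 73], [41, 22, 14]]
range(0, 2)
[[6, 18, 1], [1, 93, 55]]
range(0, 2)
[[57, 2, 90], [58, 24, 20]]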
jezrael answered Jan 26 '23 00:01


Sometimes it is best to minimize the footprint and overhead.
Here is a plain loop using itertools.count and collections.defaultdict:

from itertools import count
from collections import defaultdict

# one independent 0-based counter per distinct Name
counts = {k: count(0) for k in df.Name.unique()}
d = defaultdict(dict)

# k is the Name, v collects the remaining values of the row
for k, *v in df.values.tolist():
    d[k][next(counts[k])] = v

dict(d)

{'a': {0: [9, 81, 63], 1: [98, 5, 89]},
 'b': {0: [51, 50, 73], 1: [41, 22, 14]},
 'c': {0: [6, 18, 1], 1: [1, 93, 55]},
 'd': {0: [57, 2, 90], 1: [58, 24, 20]}}
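
A possible simplification of the loop above, offered as a sketch rather than as part of the original answer: the number of rows already stored for a name is also the next index, so the separate counters can be dropped. The sample DataFrame is rebuilt from the question's table, and `result` is an illustrative name.

```python
from collections import defaultdict

import pandas as pd

# Sample data rebuilt from the question's table (assumed to match it).
df = pd.DataFrame({
    'Name': ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'd'],
    'x': [9, 98, 51, 41, 6, 1, 57, 58],
    'y': [81, 5, 50, 22, 18, 93, 2, 24],
    'r': [63, 89, 73, 14, 1, 55, 90, 20],
})

d = defaultdict(dict)

# Select the columns explicitly so k is always the Name and v the rest.
for k, *v in df[['Name', 'x', 'y', 'r']].values.tolist():
    # len(d[k]) counts the rows stored so far for this name,
    # so it doubles as the next 0-based index.
    d[k][len(d[k])] = v

result = dict(d)
print(result)
```

This only works because rows are never removed from the inner dicts; with deletions, an explicit counter like the one in the answer would be the safer choice.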
piRSquared answered Jan 26 '23 01:01