I have a dict of lists of tuples of the form:
{identifier1:[(date1,value1),
(date2,value2)],
identifier2:[(date1,value1),
(date3,value3),
(date4,value4)]
}
I'm trying to parse this into a DataFrame, but the lists have different lengths and the tuples contain duplicate values. The shape I want is three columns (identifier, date and value) with no NaN values. I have tried various combinations, such as the from_dict method, with very little success.
You can use a list comprehension with the DataFrame constructor (Python 3):
import pandas as pd

d = {'identifier1':[('date1','value1'),('date2','value2')],
     'identifier2':[('date1','value1'),('date3','value3'),('date4','value4')]}
L = [(k, *t) for k, v in d.items() for t in v]
df = pd.DataFrame(L, columns=['identifier','date','val'])
print (df)
    identifier   date     val
0  identifier1  date1  value1
1  identifier1  date2  value2
2  identifier2  date1  value1
3  identifier2  date3  value3
4  identifier2  date4  value4
For Python 2, use:
L = [(k, t[0], t[1]) for k, v in d.items() for t in v]
df = pd.DataFrame(L, columns=['identifier','date','val'])
print (df)
    identifier   date     val
0  identifier1  date1  value1
1  identifier1  date2  value2
2  identifier2  date1  value1
3  identifier2  date3  value3
4  identifier2  date4  value4
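Both versions rely on the same flattening step: each (date, value) tuple becomes one row with its dict key prepended, so lists of different lengths never need padding with NaN. As a minimal sketch, here is that step on its own without pandas, using the same sample data. The names rows_unpack and rows_index are only illustrative.

```python
# Same sample data as above; the string values are placeholders.
d = {'identifier1': [('date1', 'value1'), ('date2', 'value2')],
     'identifier2': [('date1', 'value1'), ('date3', 'value3'), ('date4', 'value4')]}

# One output row per (date, value) tuple, with the dict key prepended.
# Star-unpacking inside a tuple display needs Python 3.5 or later.
rows_unpack = [(k, *t) for k, v in d.items() for t in v]

# Index-based form; this one is also valid on Python 2.
rows_index = [(k, t[0], t[1]) for k, v in d.items() for t in v]

print(len(rows_unpack))           # one row per tuple across all lists
print(rows_unpack == rows_index)  # both forms build the same rows
```

Every row has exactly three fields, which is why the resulting DataFrame has three fully populated columns.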