I have the following dict, with keys as tuples:
d = {('first', 'row'): 3, ('second', 'row'): 1}
I'd like to create a dataframe with 3 columns: Col1, Col2 and Col3 which should look like this:
Col1 Col2 Col3
first row 3
second row 1
I can't figure out how to split the tuples other than parsing the dict pair by pair.
Construct a Series first; resetting its index then gives you a DataFrame:
pd.Series(d).reset_index()
Out:
level_0 level_1 0
0 first row 3
1 second row 1
You can rename columns afterwards:
df = pd.Series(d).reset_index()
df.columns = ['Col1', 'Col2', 'Col3']
df
Out:
Col1 Col2 Col3
0 first row 3
1 second row 1
Or in one line, naming the MultiIndex levels first:
pd.Series(d).rename_axis(['Col1', 'Col2']).reset_index(name='Col3')
Out[7]:
Col1 Col2 Col3
0 first row 3
1 second row 1
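If you need this in more than one place, the one-liner can be wrapped in a small helper. This is only a sketch: the function name `tuple_dict_to_frame` and its parameters are made up for illustration, and it assumes every key is a tuple of the same length.

```python
import pandas as pd


def tuple_dict_to_frame(d, key_names, value_name):
    """Turn a dict keyed by equal-length tuples into a flat DataFrame.

    Hypothetical helper, not part of pandas.
    """
    # Tuple keys become a MultiIndex; naming its levels names the key columns.
    s = pd.Series(d).rename_axis(key_names)
    # reset_index moves the index levels into ordinary columns;
    # `name` labels the column that holds the dict values.
    return s.reset_index(name=value_name)


d = {('first', 'row'): 3, ('second', 'row'): 1}
df = tuple_dict_to_frame(d, ['Col1', 'Col2'], 'Col3')
print(df)
```

Passing the names as arguments avoids overwriting `df.columns` afterwards, which is easy to get wrong if the tuple length ever changes.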
Not as elegant as @ayhan's solution:
In [21]: pd.DataFrame(list(d), columns=['Col1','Col2']).assign(Col3=d.values())
Out[21]:
Col1 Col2 Col3
0 first row 3
1 second row 1
Or a more straightforward one:
In [27]: pd.DataFrame([[k[0],k[1],v] for k,v in d.items()]) \
.rename(columns={0:'Col1',1:'Col2',2:'Col3'})
Out[27]:
Col1 Col2 Col3
0 first row 3
1 second row 1
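The list-comprehension approach can also take the column names directly in the constructor, which skips the separate rename step. A minimal sketch, assuming Python 3 (for the `*key` unpacking) and keys that all have the same length:

```python
import pandas as pd

d = {('first', 'row'): 3, ('second', 'row'): 1}

# (*key, value) flattens each tuple key and appends its value as the last field,
# so this works for keys of any (consistent) length, not just pairs.
rows = [(*key, value) for key, value in d.items()]

# Supplying `columns` here names the fields positionally.
df = pd.DataFrame(rows, columns=['Col1', 'Col2', 'Col3'])
print(df)
```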
I was curious whether this could be done with a MultiIndex directly, so I made an attempt. This may have its benefits if you want to specify levels. Following the MultiIndex example in the pandas documentation, I came up with an alternative solution.
First I created a dictionary of random data
s = {(1,2):"a", (4,5):"b", (1,5):"w", (2, 3):"z", (4,1):"p"}
Then I used pd.MultiIndex to create a hierarchical index from the dictionary's keys.
index = pd.MultiIndex.from_tuples(s.keys())
index
Out[3]:
MultiIndex(levels=[[1, 2, 4], [1, 2, 3, 5]],
labels=[[0, 2, 2, 1, 0], [1, 3, 0, 2, 3]])
Then I passed the dictionary's values directly to a pandas Series and explicitly set the index to the MultiIndex object I created above.
pd.Series(s.values(), index=index)
Out[4]:
1 2 a
4 5 b
1 p
2 3 z
1 5 w
dtype: object
Lastly, I reset the index to get the solution requested by OP
pd.Series(s.values(), index=index).reset_index()
Out[5]:
level_0 level_1 0
0 1 2 a
1 4 5 b
2 4 1 p
3 2 3 z
4 1 5 w
This was a bit more involved, so @ayhan's answer may still be preferable, but I think it gives you an idea of what pandas may be doing in the background, or at least gives anyone the opportunity to tinker with pandas' mechanics a bit more.
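One possible refinement of the MultiIndex route: `pd.MultiIndex.from_tuples` accepts a `names` argument, and a named Series keeps its name as the value column when the index is reset. The sketch below uses that to get `Col1`, `Col2` and `Col3` without renaming anything afterwards. The column names are borrowed from the question, and the dict views are wrapped in `list(...)` as a precaution for older pandas versions.

```python
import pandas as pd

s = {(1, 2): "a", (4, 5): "b", (1, 5): "w", (2, 3): "z", (4, 1): "p"}

# Name the index levels up front; they become the column names on reset_index.
index = pd.MultiIndex.from_tuples(list(s.keys()), names=['Col1', 'Col2'])

# The Series name becomes the name of the value column.
ser = pd.Series(list(s.values()), index=index, name='Col3')

df = ser.reset_index()
print(df)
```

On Python 3.7+ the rows come out in the dict's insertion order, so the row order may differ from the output shown above.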
You can easily create a DataFrame from a dict:
import pandas as pd
d = {('first', 'row'): 3, ('second', 'row'): 1}
df = pd.DataFrame.from_dict({'col': d}, orient='columns')
df
|        |     | col |
| ------ | --- | --- |
| first  | row | 3   |
| second | row | 1   |
Now for cosmetic purposes, you can get your output dataframe with:
df = df.reset_index()
df.columns = 'Col1 Col2 Col3'.split()