Say I have three dictionaries
dictionary_col2
{'MOB': [1, 2], 'ASP': [1, 2], 'YIP': [1, 2]}
dictionary_col3
{'MOB': ['MOB_L001_R1_001.gz',
'MOB_L002_R1_001.gz'],
'ASP': ['ASP_L001_R1_001.gz',
'ASP_L002_R1_001.gz'],
'YIP': ['YIP_L001_R1_001.gz',
'YIP_L002_R1_001.gz']}
dictionary_col4
{'MOB': ['MOB_L001_R2_001.gz',
'MOB_L002_R2_001.gz'],
'ASP': ['ASP_L001_R2_001.gz',
'ASP_L002_R2_001.gz'],
'YIP': ['YIP_L001_R2_001.gz',
'YIP_L002_R2_001.gz']}
I want to convert the above dictionaries into a single data frame. I have tried the following:
df = pd.DataFrame([dictionary_col2, dictionary_col3, dictionary_col4])
The df data frame looks like this:
ASP MOB YIP
0 [1, 2] [1, 2] [1, 2]
1 [ASP_L001_R1_001.gz, ASP_L002_R1_001.gz] [MOB_L001_R1_001.gz, MOB_L002_R1_001.gz] [YIP_L001_R1_001.gz, YIP_L002_R1_001.gz]
2 [ASP_L001_R2_001.gz, ASP_L002_R2_001.gz] [MOB_L001_R2_001.gz, MOB_L002_R2_001.gz] [YIP_L001_R2_001.gz, YIP_L002_R2_001.gz]
My aim is to have a data frame with the following columns:
col1 col2 col3 col4
MOB 1 MOB_L001_R1_001.gz MOB_L001_R2_001.gz
MOB 2 MOB_L002_R1_001.gz MOB_L002_R2_001.gz
ASP 1 ASP_L001_R1_001.gz ASP_L001_R2_001.gz
ASP 2 ASP_L002_R1_001.gz ASP_L002_R2_001.gz
YIP 1 YIP_L001_R1_001.gz YIP_L001_R2_001.gz
YIP 2 YIP_L002_R1_001.gz YIP_L002_R2_001.gz
Any help/suggestions are appreciated!!
A pandas DataFrame is a two-dimensional labeled data structure that represents tabular data in rows and columns, with columns of potentially different types. It is generally the most commonly used pandas object, and it can be created in several ways, including from a list of dictionaries.
When a DataFrame is initialised from a dictionary of lists, each key/value pair becomes one column: the key becomes the column name and the list becomes the column data. Building one such frame per dictionary and unstacking it gives a value for every combination of key and list position, so:
pd.DataFrame({'col2': pd.DataFrame(dictionary_col2).unstack(),
              'col3': pd.DataFrame(dictionary_col3).unstack(),
              'col4': pd.DataFrame(dictionary_col4).unstack()}).reset_index(level=0)
returns
level_0 col2 col3 col4
0 ASP 1 ASP_L001_R1_001.gz ASP_L001_R2_001.gz
1 ASP 2 ASP_L002_R1_001.gz ASP_L002_R2_001.gz
0 MOB 1 MOB_L001_R1_001.gz MOB_L001_R2_001.gz
1 MOB 2 MOB_L002_R1_001.gz MOB_L002_R2_001.gz
0 YIP 1 YIP_L001_R1_001.gz YIP_L001_R2_001.gz
1 YIP 2 YIP_L002_R1_001.gz YIP_L002_R2_001.gz
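The result above still has the key column named level_0 and keeps the list position as the index. A minimal sketch of one way to get the col1/col2/col3/col4 layout asked for in the question, assuming the three dictionaries share the same keys and list lengths (the intermediate name `stacked` is only illustrative):

```python
import pandas as pd

dictionary_col2 = {"MOB": [1, 2], "ASP": [1, 2], "YIP": [1, 2]}
dictionary_col3 = {
    "MOB": ["MOB_L001_R1_001.gz", "MOB_L002_R1_001.gz"],
    "ASP": ["ASP_L001_R1_001.gz", "ASP_L002_R1_001.gz"],
    "YIP": ["YIP_L001_R1_001.gz", "YIP_L002_R1_001.gz"],
}
dictionary_col4 = {
    "MOB": ["MOB_L001_R2_001.gz", "MOB_L002_R2_001.gz"],
    "ASP": ["ASP_L001_R2_001.gz", "ASP_L002_R2_001.gz"],
    "YIP": ["YIP_L001_R2_001.gz", "YIP_L002_R2_001.gz"],
}

# Unstack each dictionary into a Series indexed by (key, list position).
stacked = {
    "col2": pd.DataFrame(dictionary_col2).unstack(),
    "col3": pd.DataFrame(dictionary_col3).unstack(),
    "col4": pd.DataFrame(dictionary_col4).unstack(),
}

df = (
    pd.DataFrame(stacked)                  # align the three Series on their shared index
    .reset_index()                         # unnamed index levels become level_0 and level_1
    .rename(columns={"level_0": "col1"})   # level_0 holds the dictionary keys
    .drop(columns="level_1")               # level_1 is just the position within each list
)
print(df)
```

The row order follows whatever order pandas gives the keys, which may differ between versions, so sort or reindex afterwards if a specific order matters.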
IIUC, you can do:
pd.concat([pd.DataFrame(d).stack() for d in (dictionary_col2, dictionary_col3, dictionary_col4)], axis=1)
Output:
0 1 2
0 MOB 1 MOB_L001_R1_001.gz MOB_L001_R2_001.gz
ASP 1 ASP_L001_R1_001.gz ASP_L001_R2_001.gz
YIP 1 YIP_L001_R1_001.gz YIP_L001_R2_001.gz
1 MOB 2 MOB_L002_R1_001.gz MOB_L002_R2_001.gz
ASP 2 ASP_L002_R1_001.gz ASP_L002_R2_001.gz
YIP 2 YIP_L002_R1_001.gz YIP_L002_R2_001.gz
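Note that the stack-based output groups rows by list position rather than by key. If the rows should stay grouped by key in the dictionaries' own order, as in the question's target table, one alternative is to build the rows in plain Python first. This is only a sketch, and it assumes every key appears in all three dictionaries with lists of equal length; `rows` is an illustrative name:

```python
import pandas as pd

dictionary_col2 = {"MOB": [1, 2], "ASP": [1, 2], "YIP": [1, 2]}
dictionary_col3 = {
    "MOB": ["MOB_L001_R1_001.gz", "MOB_L002_R1_001.gz"],
    "ASP": ["ASP_L001_R1_001.gz", "ASP_L002_R1_001.gz"],
    "YIP": ["YIP_L001_R1_001.gz", "YIP_L002_R1_001.gz"],
}
dictionary_col4 = {
    "MOB": ["MOB_L001_R2_001.gz", "MOB_L002_R2_001.gz"],
    "ASP": ["ASP_L001_R2_001.gz", "ASP_L002_R2_001.gz"],
    "YIP": ["YIP_L001_R2_001.gz", "YIP_L002_R2_001.gz"],
}

# One tuple per (key, list position), walking the keys in insertion order.
rows = [
    (key, number, read1, read2)
    for key in dictionary_col2
    for number, read1, read2 in zip(
        dictionary_col2[key], dictionary_col3[key], dictionary_col4[key]
    )
]

df = pd.DataFrame(rows, columns=["col1", "col2", "col3", "col4"])
print(df)
```

Because zip stops at the shortest list, a key with mismatched list lengths would silently lose rows, so it may be worth checking the lengths beforehand.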