I have the following CSV, with the first row as the header:
id,data
a,"{'1': 0.7778, '3': 0.5882, '2': 0.9524, '4': 0.5556}"
b,"{'1': 0.7778, '3': 0.5, '2': 0.7059, '4': 0.2222}"
c,"{'1': 0.8182, '3': 0.2609, '2': 0.5882}"
I need to get to something like this:
id       1       2       3       4
a   0.7778  0.9524  0.5882  0.5556
b   0.7778  0.7059  0.5     0.2222
c   0.8182  0.5882  0.2609  NaN
where the keys of the dictionary are the columns.
How can I do this using pandas?
Series.str.split() breaks up the values of a single column into multiple columns (with expand=True) based on a specified separator or delimiter. It is similar to Python's string split() method, but it is applied to every element of the Series.
Method 1: Split dictionary keys and values using built-in methods. Here, we use Python's built-in .keys() and .values() dictionary methods to get the keys and values, which can then be converted into separate lists.
The str.split() function splits strings around a given separator/delimiter. It splits each string in the Series/Index from the beginning, at the specified delimiter string.
You can do this with Python's ast module, using ast.literal_eval to parse each string into a dict:
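import ast
import pandas as pd
df = pd.read_csv('/path/to/your.csv')
# Parse each string in the 'data' column into a dict; each key becomes a column
dict_df = pd.DataFrame([ast.literal_eval(s) for s in df['data']])
# Sort the columns by key (newer pandas versions keep insertion order: 1, 3, 2, 4)
dict_df = dict_df.sort_index(axis=1)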
>>> dict_df
        1       2       3       4
0  0.7778  0.9524  0.5882  0.5556
1  0.7778  0.7059  0.5000  0.2222
2  0.8182  0.5882  0.2609     NaN
df = df.drop(columns='data')
final_df = pd.concat([df, dict_df], axis=1)
>>> final_df
  id       1       2       3       4
0  a  0.7778  0.9524  0.5882  0.5556
1  b  0.7778  0.7059  0.5000  0.2222
2  c  0.8182  0.5882  0.2609     NaN
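For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch of the same approach. It reads the sample rows from the question out of an in-memory string, and wraps the steps in a helper called expand_dict_column, which is a hypothetical name chosen for illustration and not part of pandas.

```python
import ast
import io

import pandas as pd

# Sample data copied from the question, read from memory instead of a file.
CSV_TEXT = '''id,data
a,"{'1': 0.7778, '3': 0.5882, '2': 0.9524, '4': 0.5556}"
b,"{'1': 0.7778, '3': 0.5, '2': 0.7059, '4': 0.2222}"
c,"{'1': 0.8182, '3': 0.2609, '2': 0.5882}"
'''


def expand_dict_column(df, column):
    """Hypothetical helper: replace a column of dict-like strings
    with one column per dictionary key."""
    # literal_eval only evaluates Python literals, so it is safer than eval().
    parsed = df[column].apply(ast.literal_eval)
    # Reuse the original index so the join below lines up row by row.
    expanded = pd.DataFrame(parsed.tolist(), index=df.index)
    # Sort the key columns; keys missing from a row become NaN.
    expanded = expanded.sort_index(axis=1)
    return df.drop(columns=column).join(expanded)


df = pd.read_csv(io.StringIO(CSV_TEXT))
final_df = expand_dict_column(df, "data")
print(final_df)
```

Note that the keys here are strings, so the columns sort lexicographically; with keys such as '10', that key would come before '2' unless the column labels are converted to integers first.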