
Reading pandas dataframe that contains dictionaries in cells from csv

I saved a pandas dataframe that looks like the following as a csv file.

                 a
0    {'word': 5.7}
1  {'khfds': 8.34}

When I attempt to read the dataframe as shown below, I receive the following error.

df = pd.read_csv('foo.csv', index_col=0, dtype={'str': 'dict'})

TypeError: data type "dict" not understood

What I really want to know is how to read the csv file so that I recover the dataframe in the same form as when it was created. I have also tried reading it without the dtype={} argument, and with 'dict' replaced by alternatives such as 'dictionary', 'object', and 'str'.

TommyTorty10 asked Jun 07 '18 00:06




1 Answer

A CSV file holds only text, so each dictionary is written out as its string representation and comes back as a plain string. You therefore need to parse that string back into a dict after reading. One way is to use ast.literal_eval:

import pandas as pd
from ast import literal_eval
from io import StringIO

mystr = StringIO("""a
{'word': 5.7}
{'khfds': 8.34}""")

df = pd.read_csv(mystr)

# Parse each cell's string back into a Python dict.
df['a'] = df['a'].apply(literal_eval)

# Confirm that every cell now holds a dict.
print(df['a'].apply(type))

0    <class 'dict'>
1    <class 'dict'>
Name: a, dtype: object
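
The same parsing can also be done while the file is being read, through the converters argument of read_csv. This is a minimal sketch, assuming the column is named a as in the question and using an in-memory buffer as a stand-in for foo.csv with the index stored in the first column:

```python
import pandas as pd
from ast import literal_eval
from io import StringIO

# Stand-in for foo.csv (an assumption): the first column is the saved index.
csv_text = StringIO(""",a
0,{'word': 5.7}
1,{'khfds': 8.34}
""")

# converters maps a column name to a function applied to each raw cell string.
df = pd.read_csv(csv_text, index_col=0, converters={'a': literal_eval})

print(df['a'].map(type))
```

With a real file, the buffer would be replaced by the path, e.g. 'foo.csv'. literal_eval only accepts Python literals, so it is a safer choice than eval for text read from a file.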

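However, I strongly recommend that you do not use Pandas to store dictionaries. A column of dicts has dtype object and holds only pointers to Python objects, so Pandas cannot apply its fast vectorized operations to it. Pandas works best with contiguous memory blocks, so where possible separate the numeric data into numeric series.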
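
One possible way to follow that recommendation is to flatten the dictionaries into ordinary columns before saving. The sketch below assumes every cell holds a single-key dict, as in the question; the column names key and value are hypothetical and chosen only for illustration:

```python
import pandas as pd
from io import StringIO

# Data shaped like the question's column: one single-key dict per row.
records = [{'word': 5.7}, {'khfds': 8.34}]

# Unpack each dict into a (key, value) pair so every column has one dtype.
# 'key' and 'value' are hypothetical column names.
pairs = [next(iter(d.items())) for d in records]
flat = pd.DataFrame(pairs, columns=['key', 'value'])

# Round-trip through CSV text; no literal_eval is needed on the way back.
buf = StringIO()
flat.to_csv(buf)
buf.seek(0)
restored = pd.read_csv(buf, index_col=0)

print(restored.dtypes)
```

In this layout the value column is read back as a float column, so numeric operations such as restored['value'].mean() work directly.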

jpp answered Oct 01 '22 20:10