Here is my example:
I first create a DataFrame and save it to a file:
import pandas as pd
df = pd.DataFrame({'col_1':[['a','b','s'], 23423]})
df.to_csv(r'C:\test.csv')
Then df.col_1[0]
returns ['a','b','s'],
a list.
Later I read it back from the file:
df_1 = pd.read_csv(r'C:\test.csv', quoting = 3, quotechar = '"')
Now df_1['col_1'][0]
returns "['a', 'b', 's']",
a string.
I would like to get the list back. I have been experimenting with different read_csv
settings, but so far no luck.
index_col: Sets which column (or columns) to use as the index of the DataFrame. The default value is None, in which case pandas normally generates an integer index starting at 0 instead of taking one from the file. It can be given as a column name or as a column position.
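As a small illustration of index_col, here is a sketch that reads the same data three ways. The CSV text and the column names (id, name) are made up for the example, and io.StringIO stands in for a real file path.

```python
import io
import pandas as pd

# Hypothetical CSV content used only for this illustration.
csv_text = "id,name\n10,alice\n20,bob\n"

# index_col left at its default (None): pandas generates a 0-based index.
default_df = pd.read_csv(io.StringIO(csv_text))

# index_col given as a column name, then as a column position.
by_name = pd.read_csv(io.StringIO(csv_text), index_col="id")
by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(default_df.index))  # [0, 1]
print(list(by_name.index))     # [10, 20]
```

Both by_name and by_position end up with the id column as the index, so only the name column remains as data.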
We can use the parse_dates parameter to have pandas convert columns into real datetime types. parse_dates takes a list of columns, since you may want to parse more than one column into datetimes.
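A minimal sketch of parse_dates, assuming a made-up CSV with a day column holding ISO-formatted dates. It checks the resulting dtype with pd.api.types.is_datetime64_any_dtype rather than printing it, since the exact dtype name can vary between pandas versions.

```python
import io
import pandas as pd

# Hypothetical CSV content; "day" holds ISO-formatted dates.
csv_text = "day,value\n2021-01-05,1\n2021-02-10,2\n"

# Without parse_dates the column is read as plain text.
as_text = pd.read_csv(io.StringIO(csv_text))

# With parse_dates the listed columns are converted to datetimes.
as_dates = pd.read_csv(io.StringIO(csv_text), parse_dates=["day"])

print(pd.api.types.is_datetime64_any_dtype(as_text["day"]))   # False
print(pd.api.types.is_datetime64_any_dtype(as_dates["day"]))  # True
print(as_dates["day"].dt.month.tolist())                      # [1, 2]
```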
The difference between read_csv() and read_table() is almost nothing. Both go through the same parsing code; they differ only in the default delimiter: read_csv() uses a comma, while read_table() uses a tab (\t).
The default value of the sep parameter is a comma (,), so if we don't pass sep to read_csv(), pandas assumes the file is comma-delimited.
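To make the delimiter defaults concrete, here is a short sketch that parses the same made-up table once as comma-separated text and once as tab-separated text. The data is invented for the example, and io.StringIO stands in for real files.

```python
import io
import pandas as pd

# The same hypothetical table, once comma-separated and once tab-separated.
comma_text = "a,b\n1,2\n3,4\n"
tab_text = "a\tb\n1\t2\n3\t4\n"

from_csv = pd.read_csv(io.StringIO(comma_text))              # sep defaults to ","
from_table = pd.read_table(io.StringIO(tab_text))            # sep defaults to "\t"
from_csv_tab = pd.read_csv(io.StringIO(tab_text), sep="\t")  # explicit tab

print(from_csv.equals(from_table))      # True
print(from_table.equals(from_csv_tab))  # True
```

All three calls produce the same DataFrame, which is why read_table() can be thought of as read_csv() with sep="\t".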
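You're not going to get the list back without a bit of work: a CSV file stores every cell as text, so the list is written out as its string representation.
Use ast.literal_eval as a converter
to turn those strings back into lists: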
import ast
conv = dict(col_1=ast.literal_eval)
pd.read_csv(r'C:\test.csv', index_col=0, converters=conv).loc[0, 'col_1']
['a', 'b', 's']
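For anyone who wants to try this without writing to C:\, here is a self-contained sketch of the same round trip using an in-memory buffer. The parse_cell helper is my own hypothetical wrapper, not part of pandas: ast.literal_eval raises on cells that are not valid Python literals (plain words, empty cells), so the wrapper falls back to the raw text in that case.

```python
import ast
import io
import pandas as pd

def parse_cell(text):
    """Hypothetical helper: parse a cell as a Python literal, else keep the text."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text

df = pd.DataFrame({'col_1': [['a', 'b', 's'], 23423]})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)

# Converters receive each cell of the column as a string.
restored = pd.read_csv(buffer, index_col=0, converters={'col_1': parse_cell})

first = restored.loc[0, 'col_1']
second = restored.loc[1, 'col_1']
print(type(first), first)  # <class 'list'> ['a', 'b', 's']
print(second)              # 23423
```

Note that literal_eval only evaluates literals (lists, dicts, numbers, strings and so on), so it is a much safer choice than eval for data read from a file.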