Pandas DataFrame stored list as string: How to convert back to list

Tags: python, string, list, pandas, dataframe

I have an n-by-m Pandas DataFrame df defined as follows. (I know this is not the best way to do it. It makes sense for what I'm trying to do in my actual code, but that would be TMI for this post so just take my word that this approach works in my particular scenario.)

    >>> from pandas import DataFrame, Series
    >>> df = DataFrame(columns=['col1'])
    >>> df.append(Series([None]), ignore_index=True)
    >>> df
    Empty DataFrame
    Columns: [col1]
    Index: []

I stored lists in the cells of this DataFrame as follows.

    >>> df['col1'][0] = [1.23, 2.34]
    >>> df
               col1
    0  [1.23, 2.34]

For some reason, the DataFrame stored this list as a string instead of a list.

    >>> df['col1'][0]
    '[1.23, 2.34]'

I have 2 questions for you.

1. Why does the DataFrame store a list as a string and is there a way around this behavior?
2. If not, then is there a Pythonic way to convert this string into a list?

Update

The DataFrame I was using had been saved to and loaded from a CSV file. This format, rather than the DataFrame itself, converted the list into a string.

274

asked Apr 16 '14 14:04

Gyan Veda

1 Answer

As you pointed out, this can commonly happen when saving and loading pandas DataFrames as .csv files, which is a text format.

In your case, this happened because a .csv file can only hold text, so each list was written out as its string representation (for example '[1.23, 2.34]'). Loading the .csv then yields that string rather than the original list.
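The effect is easy to reproduce. The following is a minimal sketch, assuming a reasonably recent pandas and using an in-memory buffer as a stand-in for a real .csv file; the column name and values are taken from the question, the variable names are only illustrative.

```python
import io

import pandas as pd

# A one-row frame whose single cell holds a real Python list.
df = pd.DataFrame({"col1": [[1.23, 2.34]]})

# Write to an in-memory buffer (stand-in for a .csv file) and read it back.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
loaded = pd.read_csv(buffer)

before = df["col1"].iloc[0]      # still a list
after = loaded["col1"].iloc[0]   # now the text "[1.23, 2.34]"
print(type(before).__name__, type(after).__name__)
```

Before the round trip the cell is a list; afterwards it is a plain string, which matches what the question describes.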

If you want to store the actual objects, you should use DataFrame.to_pickle() (note: objects must be picklable!).
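As a rough sketch of that pickle round trip (assuming the frame from the question; the file name frame.pkl and the temporary directory are hypothetical and only keep the example self-contained):

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"col1": [[1.23, 2.34]]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "frame.pkl")  # hypothetical file name
    df.to_pickle(path)                # serializes the Python objects as-is
    restored = pd.read_pickle(path)   # counterpart of to_pickle

value = restored["col1"].iloc[0]
print(type(value).__name__, value)
```

Here the cell should come back as a list rather than a string. Keep in mind that pickle files should only be loaded from sources you trust.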

To answer your second question, you can convert it back with ast.literal_eval:

    >>> from ast import literal_eval
    >>> literal_eval('[1.23, 2.34]')
    [1.23, 2.34]
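To convert a whole column rather than a single value, one possible approach is sketched below. It assumes every cell in the column is a valid Python literal; the CSV text and variable names are made up for illustration.

```python
import io
from ast import literal_eval

import pandas as pd

# Illustrative CSV content: each cell is the string form of a list.
csv_text = 'col1\n"[1.23, 2.34]"\n"[3.45, 4.56]"\n'

# Option 1: convert a column of strings after loading.
loaded = pd.read_csv(io.StringIO(csv_text))
loaded["col1"] = loaded["col1"].apply(literal_eval)

# Option 2: convert while reading, via read_csv's converters argument.
converted = pd.read_csv(io.StringIO(csv_text), converters={"col1": literal_eval})

print(loaded["col1"].tolist())
print(converted["col1"].tolist())
```

Both options should yield real lists in col1. Note that literal_eval raises an exception for cells that are empty or not valid literals, so such rows would need separate handling.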

135

answered Oct 05 '22 21:10

anon582847382
