How to read a column of csv as dtype list using pandas?

Tags:

python

pandas

csv

I have a CSV file with three columns, where every row of the third column (Col3) holds a list of values, as the following sample shows:

    Col1,Col2,Col3
    1,a1,"['Proj1', 'Proj2']"
    2,a2,"['Proj3', 'Proj2']"
    3,a3,"['Proj4', 'Proj1']"
    4,a4,"['Proj3', 'Proj4']"
    5,a5,"['Proj5', 'Proj2']"

Whenever I read this CSV, Col3 comes back as a str object rather than a list. I tried to change the dtype of that column to list, but got an AttributeError, as shown below:

    df = pd.read_csv("inputfile.csv")
    df.Col3.dtype = list

    AttributeError                            Traceback (most recent call last)
    <ipython-input-19-6f9ec76b1b30> in <module>()
    ----> 1 df.Col3.dtype = list

    C:\Python27\lib\site-packages\pandas\core\generic.pyc in __setattr__(self, name, value)
       1953                     object.__setattr__(self, name, value)
       1954             except (AttributeError, TypeError):
    -> 1955                 object.__setattr__(self, name, value)
       1956
       1957     #----------------------------------------------------------------------

    AttributeError: can't set attribute

It would be really great if you can guide me how to go about it.

asked Sep 23 '15 14:09

nachiappanpl

1 Answer

You could use literal_eval from the standard library's ast module:

    from ast import literal_eval

    # Parse each string such as "['Proj1', 'Proj2']" into a real Python list
    df.Col3 = df.Col3.apply(literal_eval)
    print(df.Col3[0][0])
    # Proj1
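For anyone who wants to try this without a file on disk, here is a minimal sketch that feeds the sample rows from the question through io.StringIO. The variable names (csv_text, type_before, type_after, first_project) are only for illustration and are not part of the original answer.

```python
from ast import literal_eval
from io import StringIO

import pandas as pd

# Sample rows copied from the question, held in memory instead of a file.
csv_text = """Col1,Col2,Col3
1,a1,"['Proj1', 'Proj2']"
2,a2,"['Proj3', 'Proj2']"
3,a3,"['Proj4', 'Proj1']"
"""

df = pd.read_csv(StringIO(csv_text))
type_before = type(df.Col3[0])  # each cell is still a plain string here

# Turn every string cell into the Python list it spells out.
df["Col3"] = df["Col3"].apply(literal_eval)
type_after = type(df.Col3[0])
first_project = df.Col3[0][0]

print(type_before, type_after, first_project)
```

literal_eval only accepts Python literals, so unlike eval it will not run arbitrary code that happens to be in the file.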

You can also do it when you create the dataframe from the csv, using converters:

    df = pd.read_csv("in.csv", converters={"Col3": literal_eval})
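One thing to watch with converters: the function receives the raw cell text, so an empty cell arrives as an empty string, which literal_eval rejects. The sketch below works around that with a small wrapper. parse_list is a hypothetical helper name, and the row with the blank Col3 is made up for the example; neither is in the original question.

```python
from ast import literal_eval
from io import StringIO

import pandas as pd


def parse_list(cell):
    """Hypothetical helper: parse a list literal, treating an empty cell as []."""
    cell = cell.strip()
    return literal_eval(cell) if cell else []


# Row 2 deliberately has an empty Col3 (an assumption for this example).
csv_text = """Col1,Col2,Col3
1,a1,"['Proj1', 'Proj2']"
2,a2,
3,a3,"['Proj4', 'Proj1']"
"""

# The converter runs on every Col3 cell while the file is being parsed.
df = pd.read_csv(StringIO(csv_text), converters={"Col3": parse_list})
print(df.Col3.tolist())
```

Whether an empty list is the right stand-in for a missing cell depends on your data; returning None would work just as well with the same wrapper.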

If you are sure the format is the same for all strings, stripping and splitting will be a lot faster:

    df = pd.read_csv("in.csv", converters={"Col3": lambda x: x.strip("[]").split(", ")})

But you will end up with each string still wrapped in its quotes, for example "'Proj1'" rather than "Proj1".
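If the leftover quotes are a problem, one option is to strip them from each item after splitting. This is a rough sketch, assuming no item contains a comma, a bracket, or a quote character of its own; split_list_cell is a hypothetical helper name, not something from pandas.

```python
from io import StringIO

import pandas as pd


def split_list_cell(cell):
    """Hypothetical helper: split "['a', 'b']" into ['a', 'b'] without literal_eval."""
    inner = cell.strip().strip("[]")
    if not inner:
        return []
    # Drop surrounding whitespace, then the single or double quotes around each item.
    return [item.strip().strip("'\"") for item in inner.split(",")]


raw = "['Proj1', 'Proj2']"
quoted = raw.strip("[]").split(", ")  # plain strip/split: items keep their quotes
clean = split_list_cell(raw)          # quotes removed

csv_text = """Col1,Col2,Col3
1,a1,"['Proj1', 'Proj2']"
2,a2,"['Proj3', 'Proj2']"
"""

df = pd.read_csv(StringIO(csv_text), converters={"Col3": split_list_cell})
print(quoted)
print(clean)
print(df.Col3[0])
```

If the items can contain commas or quotes, literal_eval is the safer choice even though it is slower.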

answered Sep 21 '22 21:09

Padraic Cunningham
