Python convert comma separated list to pandas dataframe

I am struggling to convert a list of comma-separated strings into a multi-column (7) DataFrame.

    print(type(mylist))
    <type 'list'>
    print(mylist)
    ['AN,2__AAS000,26,20150826113000,-283.000,20150826120000,-283.000',
     'AN,2__AE000,26,20150826113000,0.000,20150826120000,0.000',.........

The following creates a frame of a single column:

    df = pd.DataFrame(mylist)

I have reviewed the built-in CSV functionality in pandas, but my CSV data is held in a list rather than in a file. How can I simply convert the list into a 7-column DataFrame?

Thanks in advance.


asked Aug 26 '15 10:08

user636322

1 Answer

You need to split each string in your list:

    import pandas as pd

    # Split each comma-separated string into a list of 7 fields (one row each).
    df = pd.DataFrame([sub.split(",") for sub in mylist])
    print(df)

Output:

        0         1   2               3         4               5         6
    0  AN  2__AS000  26  20150826113000  -283.000  20150826120000  -283.000
    1  AN   2__A000  26  20150826113000     0.000  20150826120000     0.000
    2  AN  2__AE000  26  20150826113000  -269.000  20150826120000  -269.000
    3  AN  2__AE000  26  20150826113000  -255.000  20150826120000  -255.000
    4  AN   2__AE00  26  20150826113000  -254.000  20150826120000  -254.000
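Note that splitting leaves every cell as a string. If you need real numbers and timestamps, a minimal sketch of the conversion could look like the following. The column names here are hypothetical (the question does not say what the fields mean), and the timestamp format is an assumption based on how the sample values look:

```python
import pandas as pd

mylist = [
    "AN,2__AS000,26,20150826113000,-283.000,20150826120000,-283.000",
    "AN,2__AE000,26,20150826113000,0.000,20150826120000,0.000",
]

# Hypothetical column names; replace them with whatever the fields really are.
columns = ["kind", "code", "num", "start", "start_value", "end", "end_value"]
df = pd.DataFrame([row.split(",") for row in mylist], columns=columns)

# str.split yields strings only, so convert the numeric columns explicitly.
for col in ["num", "start_value", "end_value"]:
    df[col] = pd.to_numeric(df[col])

# Assumed layout of the timestamps: YYYYMMDDHHMMSS.
for col in ["start", "end"]:
    df[col] = pd.to_datetime(df[col], format="%Y%m%d%H%M%S")

print(df.dtypes)
```

If the columns should stay as text, you can skip the conversion loops and only pass `columns=`.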

If you know how many lines of metadata to skip in your CSV file, you can do it all with read_csv using skiprows=lines_of_metadata:

    import pandas as pd

    df = pd.read_csv("in.csv", skiprows=3, header=None)
    print(df)

Or, if each line of the metadata starts with a certain character, you can use the comment parameter:

    df = pd.read_csv("in.csv", header=None, comment="#")
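Both read_csv calls above read from a file, while the data in the question is already held in a list. read_csv also accepts a file-like object, so one option is to join the list into a single text and wrap it in io.StringIO. This is a sketch using the sample rows plus one made-up metadata line to show that comment still applies:

```python
import io

import pandas as pd

mylist = [
    "# made-up metadata line, only here to show that comment= still works",
    "AN,2__AS000,26,20150826113000,-283.000,20150826120000,-283.000",
    "AN,2__AE000,26,20150826113000,0.000,20150826120000,0.000",
]

# Join the strings into one CSV text and expose it as an in-memory file.
buffer = io.StringIO("\n".join(mylist))

# header=None because the data has no header row; lines starting with "#"
# are skipped entirely.
df = pd.read_csv(buffer, header=None, comment="#")
print(df)
```

Unlike the split approach, read_csv infers column types here, so the numeric fields come back as numbers rather than strings.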

If the metadata prefix is longer than one character, you can combine itertools.dropwhile with csv.reader, which drops the leading lines that start with that prefix (here #!!):

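    import csv
    from itertools import dropwhile

    import pandas as pd

    with open("in.csv") as f:
        # Skip the leading lines that start with the metadata prefix.
        data_lines = dropwhile(lambda line: line.startswith("#!!"), f)
        reader = csv.reader(data_lines)
        df = pd.DataFrame.from_records(list(reader))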

Using your input data with some added lines starting with #!!:

    #!! various
    #!! metadata
    #!! lines
    AN,2__AS000,26,20150826113000,-283.000,20150826120000,-283.000
    AN,2__A000,26,20150826113000,0.000,20150826120000,0.000
    AN,2__AE000,26,20150826113000,-269.000,20150826120000,-269.000
    AN,2__AE000,26,20150826113000,-255.000,20150826120000,-255.000
    AN,2__AE00,26,20150826113000,-254.000,20150826120000,-254.000

Outputs:

        0         1   2               3         4               5         6
    0  AN  2__AS000  26  20150826113000  -283.000  20150826120000  -283.000
    1  AN   2__A000  26  20150826113000     0.000  20150826120000     0.000
    2  AN  2__AE000  26  20150826113000  -269.000  20150826120000  -269.000
    3  AN  2__AE000  26  20150826113000  -255.000  20150826120000  -255.000
    4  AN   2__AE00  26  20150826113000  -254.000  20150826120000  -254.000
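One caveat: dropwhile stops dropping at the first line that does not match, so it only removes metadata at the top of the file. If prefixed lines can also appear between data rows, a generator that filters every line may be a better fit. The sketch below uses an inline string as a stand-in for in.csv, with an invented metadata line in the middle:

```python
import csv
import io

import pandas as pd

# Stand-in for "in.csv"; the second "#!!" line is invented for illustration.
text = """#!! various
AN,2__AS000,26,20150826113000,-283.000,20150826120000,-283.000
#!! metadata in the middle
AN,2__A000,26,20150826113000,0.000,20150826120000,0.000
"""

with io.StringIO(text) as f:
    # Keep only the lines that do not start with the metadata prefix,
    # wherever they occur in the input.
    data_lines = (line for line in f if not line.startswith("#!!"))
    df = pd.DataFrame(list(csv.reader(data_lines)))

print(df)
```

With a real file, replace the io.StringIO(text) call with open("in.csv", newline="").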


answered Oct 21 '22 17:10

Padraic Cunningham
