Python Pandas: How to skip columns when reading a file?

I have a table formatted as follows:

foo - bar - 10 2e-5 0.0 some information
quz - baz - 4 1e-2 1 some other description in here

When I open it with pandas like this:

a = pd.read_table("file", header=None, sep=" ")

It raises:

CParserError: Error tokenizing data. C error: Expected 9 fields in line 2, saw 12

What I'd like is something similar to the skiprows option, which would let me write something like:

a = pd.read_table("file", header=None, sep=" ", skipcolumns=[8:])

I'm aware that I could reformat this table with awk, but I'd like to know whether a Pandas solution exists.

Thanks.

asked Jun 23 '14 12:06

jrjc

1 Answer

The usecols parameter allows you to select which columns to use:

a = pd.read_table("file", header=None, sep=" ", usecols=range(8))

However, to accept rows with differing numbers of fields, you also need to pass engine='python'.
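For illustration, here is a minimal sketch that combines both parameters on the two sample rows from the question. It reads from an in-memory io.StringIO buffer instead of "file" (an assumption made only so the snippet is self-contained), and behavior may differ between pandas versions.

```python
import io

import pandas as pd

# The two sample rows from the question; the trailing free-text
# description makes the rows 9 and 12 fields long respectively.
data = (
    "foo - bar - 10 2e-5 0.0 some information\n"
    "quz - baz - 4 1e-2 1 some other description in here\n"
)

# usecols keeps only the first eight space-separated fields;
# engine="python" is the parser suggested in the answer above.
a = pd.read_table(
    io.StringIO(data),  # stand-in for the real file path
    header=None,
    sep=" ",
    usecols=range(8),
    engine="python",
)

print(a)
```

Note that with usecols=range(8), column 7 holds only the first word of the description ("some" in both rows). Passing usecols=range(7) instead would drop the description entirely.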

answered Oct 21 '22 14:10

otus
