
Pandas usecols all except last

Tags: python, pandas

I have a CSV file. Is it possible to have usecols take all columns except the last one when using read_csv, without listing every column needed?

For example, if I have a 13-column file, I can do usecols=[0,1,...,10,11]. Doing usecols=[:-1] gives me a syntax error.

Is there an alternative? I'm using pandas 0.17.

Leb asked Oct 29 '15 21:10


People also ask

How do I read certain columns in pandas?

This can be done with pandas.read_csv(). Pass the CSV file as the first argument and a list of the columns you want in the usecols keyword argument. The returned DataFrame contains only those columns.
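As a minimal sketch of the above, the following reads a small in-memory CSV twice, selecting columns once by name and once by zero-based position. The CSV text and its column names are made up for illustration; io.StringIO stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content used only for this illustration.
csv_text = "name,city,cost\nalice,Paris,10\nbob,Rome,20\n"

# Select columns by name.
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["name", "cost"])

# Select the same columns by zero-based position.
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 2])

print(by_name.columns.tolist())
print(by_position.columns.tolist())
```

Both calls should yield the same two columns; the city column is never loaded.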

What does usecols mean in pandas?

usecols is supposed to provide a filter before reading the whole DataFrame into memory; if used properly, there should never be a need to delete columns after reading.

How do I skip a column in pandas?

We can exclude a column from a pandas DataFrame with loc, which selects by label: pass a boolean mask over the column labels that is False for the columns to leave out (for example name, city, or cost). The selection returns a new DataFrame without those columns and leaves the original unchanged. DataFrame.drop(columns=...) gives the same result.
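A short sketch of excluding columns after the data is already loaded, assuming a small made-up DataFrame with name, city, and cost columns. It shows drop, the loc mask approach, and positional slicing with iloc to leave out the last column.

```python
import pandas as pd

# Hypothetical data used only for this illustration.
df = pd.DataFrame(
    {"name": ["alice", "bob"], "city": ["Paris", "Rome"], "cost": [10, 20]}
)

# Drop a column by name; drop returns a new DataFrame.
without_city = df.drop(columns=["city"])

# Equivalent selection with loc and a boolean mask over the column labels.
without_city_loc = df.loc[:, df.columns != "city"]

# Leave out the last column by position.
without_last = df.iloc[:, :-1]

print(without_city.columns.tolist())
print(without_last.columns.tolist())
```

Unlike usecols, these approaches read every column into memory first and discard the unwanted ones afterwards.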


2 Answers

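Starting from version 0.20, the usecols parameter of read_csv accepts a callable, such as a lambda expression. The callable is evaluated against each column name, and only the columns for which it returns True are read. Hence, if you know the names of the columns you want to skip, you can do as follows: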

import pandas as pd
columns_to_skip = ['foo', 'bar']
# Keep every column whose name is not in columns_to_skip
df = pd.read_csv(file, usecols=lambda x: x not in columns_to_skip)

See the usecols entry in the pandas read_csv documentation for reference.
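For a self-contained version of the snippet above, here is a sketch that runs against an in-memory CSV. The CSV text and the column names a, foo, b, and bar are assumptions made up for the example.

```python
import io

import pandas as pd

# Hypothetical CSV content; 'foo' and 'bar' are the columns to leave out.
csv_text = "a,foo,b,bar\n1,2,3,4\n5,6,7,8\n"
columns_to_skip = {"foo", "bar"}

# The callable receives each column name and returns True for columns to keep.
df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda name: name not in columns_to_skip,
)

print(df.columns.tolist())
```

Note that the callable sees column names, not positions, so skipping "the last column" this way requires knowing that column's name in advance.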

gibbone answered Sep 28 '22 13:09


You can read a single row using nrows=1 to get the column names, then re-read the full CSV, skipping the last column by slicing the column index from the first read:

import pandas as pd
# First pass: read one row just to learn the column names
cols = pd.read_csv(file, nrows=1).columns
df = pd.read_csv(file, usecols=cols[:-1])
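The two-pass idea can be tried end to end with the sketch below. It assumes a made-up in-memory CSV, uses nrows=0 (header only) instead of nrows=1 for the first pass, and shows a positional variant as well, which may be handy when column names are not unique.

```python
import io

import pandas as pd

# Hypothetical CSV content used only for this illustration.
csv_text = "a,b,c,last\n1,2,3,4\n5,6,7,8\n"

# First pass: nrows=0 parses only the header row.
header = pd.read_csv(io.StringIO(csv_text), nrows=0).columns

# Second pass, by name: every column except the last.
df_by_name = pd.read_csv(io.StringIO(csv_text), usecols=list(header[:-1]))

# Second pass, by position: indices 0 .. n-2.
df_by_pos = pd.read_csv(
    io.StringIO(csv_text),
    usecols=list(range(len(header) - 1)),
)

print(df_by_name.columns.tolist())
```

The extra pass only parses the header, so its cost is small compared with reading the whole file.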

EdChum answered Sep 28 '22 13:09