Pandas: drop first column after CSV read

Is there a way to reference an object within the same line that instantiates it?

For example, I want to drop the first column (by position) of a CSV file right after reading it (by default, DataFrame.to_csv writes the index as the first column):

df = pd.read_csv(csvfile).drop(self.columns[[0]], axis=1)

I understand that self only makes sense inside the object's own context, but here it describes what I intend to do.

(Of course, doing this operation in two separate lines works perfectly.)

Ti me asked Mar 30 '18 10:03


2 Answers

One way is to use pd.DataFrame.iloc:

import pandas as pd
from io import StringIO

mystr = StringIO("""col1,col2,col3
a,b,c
d,e,f
g,h,i
""")

df = pd.read_csv(mystr).iloc[:, 1:]

#   col2 col3
# 0    b    c
# 1    e    f
# 2    h    i
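
If you specifically want drop() in the same expression, one possible sketch (not part of the original answer) is to go through DataFrame.pipe, which hands the freshly read frame to a function so the frame can refer to its own columns. The sample data and the name csv_text are made up for illustration:

```python
import pandas as pd
from io import StringIO

# Hypothetical sample data, standing in for the real CSV file.
csv_text = "col1,col2,col3\na,b,c\nd,e,f\ng,h,i\n"

# pipe() passes the DataFrame returned by read_csv to the lambda,
# so d.columns is available inside the same expression.
df = pd.read_csv(StringIO(csv_text)).pipe(lambda d: d.drop(columns=d.columns[0]))

print(df.columns.tolist())
```

This keeps everything on one line while still dropping by position, at the cost of being a little less direct than iloc.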

jpp answered Oct 12 '22 12:10


Assuming you know the total number of columns in the dataset and the indexes of the ones you want to remove, you can pass the remaining indexes to usecols:

import pandas as pd

# range() has no remove() in Python 3, so build a list first.
a = list(range(3))
a.remove(1)
df = pd.read_csv('test.csv', usecols=a)

Here 3 is the total number of columns, and the second column (index 1) is removed. You can also pass the indexes of the columns to keep directly, for example usecols=[0, 2].
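
If the total number of columns is not known up front, a possible variation (sketched here under the assumption that the unwanted column's name is known) is to pass a callable to usecols. The sample data and the column name "col2" are made up for illustration:

```python
import pandas as pd
from io import StringIO

# Hypothetical sample data, standing in for test.csv.
csv_text = "col1,col2,col3\na,b,c\nd,e,f\n"

# usecols may be a callable: it is called with each column name,
# and the column is kept when it returns True.
df = pd.read_csv(StringIO(csv_text), usecols=lambda name: name != "col2")

print(df.columns.tolist())
```

With this form the skipped column is never loaded into the DataFrame, just as with the index list above.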

Aritesh answered Oct 12 '22 13:10