
How to delete a column from a data frame with pandas?

I read my data:

    import pandas as pd

    df = pd.read_csv('/path/file.tsv', header=0, delimiter='\t')
    print(df)

and get:

            id     text
    0  361.273  text1...
    1  374.350  text2...
    2  374.350  text3...

How can I delete the id column from the above data frame? I tried the following:

    import pandas as pd

    df = pd.read_csv('/path/file.tsv', header=0, delimiter='\t')
    print(df.drop('id', axis=1))

But it raises this exception:

ValueError: labels ['id'] not contained in axis 
newWithPython asked Jan 20 '15 00:01




2 Answers

df.drop(colname, axis=1) (or del df[colname]) is the correct method to use to delete a column.

If a ValueError is raised, it means the column name is not exactly what you think it is.

Check df.columns to see what Pandas thinks are the names of the columns.
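One common cause is stray whitespace in the header, so that the label is really ' id ' rather than 'id'. The sketch below assumes that situation and uses made-up inline data in place of the real file; it is an illustration of how to inspect and normalise the labels, not a diagnosis of the asker's file. Note also that recent pandas versions report a missing label with a KeyError rather than a ValueError.

```python
import io

import pandas as pd

# Hypothetical TSV whose first header cell has stray spaces around it.
raw = " id \ttext\n361.273\ttext1\n374.350\ttext2\n"
df = pd.read_csv(io.StringIO(raw), header=0, delimiter="\t")

# list() shows each label in quotes, so hidden whitespace becomes visible.
labels_before = list(df.columns)
print(labels_before)

# Strip the whitespace from every label, then drop by the expected name.
df.columns = df.columns.str.strip()
cleaned = df.drop("id", axis=1)
print(list(cleaned.columns))
```

If the printed labels already look right, the cause lies elsewhere (for example a different delimiter than expected), and the same inspection step should reveal it.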

unutbu answered Oct 03 '22 01:10



The best way to delete a column in pandas is to use drop:

df = df.drop('column_name', axis=1) 

where 1 is the axis number (0 for rows, 1 for columns).

To delete the column without having to reassign df you can do:

df.drop('column_name', axis=1, inplace=True) 
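To make the difference between the two forms concrete, here is a small sketch on a made-up two-column frame (the data and variable names are illustrative only). It also shows the columns= keyword, which expresses the same thing without an axis number.

```python
import pandas as pd

df = pd.DataFrame({"id": [361.273, 374.350], "text": ["text1", "text2"]})

# drop() returns a new DataFrame; the original still has both columns.
without_id = df.drop("id", axis=1)
print(list(df.columns))
print(list(without_id.columns))

# Equivalent spelling using the columns= keyword instead of axis=1.
also_without_id = df.drop(columns=["id"])

# With inplace=True the frame itself is modified and the call returns None.
result = df.drop("id", axis=1, inplace=True)
print(list(df.columns))
```

Because the in-place call returns None, writing df = df.drop(..., inplace=True) would replace the frame with None; use either reassignment or inplace=True, not both.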

Finally, to drop by column position instead of by column label, index into df.columns to look up the labels first. For example, to delete the 1st, 2nd and 4th columns:

df.drop(df.columns[[0, 1, 3]], axis=1)  # df.columns is zero-based pd.Index  
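As a quick check of what the positional form does, the sketch below runs it on a made-up five-column frame (the column names a to e are placeholders).

```python
import pandas as pd

df = pd.DataFrame({"a": [1], "b": [2], "c": [3], "d": [4], "e": [5]})

# df.columns[[0, 1, 3]] looks up the labels at positions 0, 1 and 3.
to_drop = df.columns[[0, 1, 3]]
print(list(to_drop))

# The drop itself is still label-based; it receives the looked-up labels.
remaining = df.drop(to_drop, axis=1)
print(list(remaining.columns))
```

Since the drop is by label, a frame with duplicate column names would lose every column sharing a selected label, not just the one at the given position.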


Exceptions:

If a column number or label that does not exist is requested, an error is raised. To check the number of columns, use df.shape[1] or len(df.columns.values); to check the column labels, use df.columns.values.
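A minimal sketch of these checks on a made-up frame follows. The label "missing" is a placeholder for any column that is not present, and the errors="ignore" option of drop is shown as one way to skip absent labels instead of failing.

```python
import pandas as pd

df = pd.DataFrame({"id": [361.273], "text": ["text1"]})

# Inspect the frame before dropping anything.
n_columns = df.shape[1]
has_id = "id" in df.columns
has_missing = "missing" in df.columns

# Dropping an absent label raises (KeyError in recent pandas,
# ValueError in older releases), so both are caught here.
try:
    df.drop("missing", axis=1)
    raised = False
except (KeyError, ValueError):
    raised = True

# errors="ignore" silently skips labels that are not present.
unchanged = df.drop("missing", axis=1, errors="ignore")
print(n_columns, has_id, has_missing, raised, list(unchanged.columns))
```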

This answer was based on @LondonRob's answer and is left here to help future visitors of this page.

borgr answered Oct 03 '22 01:10
