
Querying Pandas DataFrame with column name that contains a space or using the drop method with a column name that contains a space

Tags: python, pandas

I am looking to use pandas to drop rows based on the column name (contains a space) and the cell value. I have tried various ways to achieve this (drop and query methods) but it seems I'm failing due to the space in the name. Is there a way to query the data using the name that has a space in it or do I need to clean all spaces first?

Data, in the form of a CSV file:

Date,"price","Sale Item"
2012-06-11,1600.20,item1
2012-06-12,1610.02,item2
2012-06-13,1618.07,item3
2012-06-14,1624.40,item4
2012-06-15,1626.15,item5
2012-06-16,1626.15,item6
2012-06-17,1626.15,item7

Attempt Examples

df.drop(['Sale Item'] != 'Item1')
df.drop('Sale Item' != 'Item1')
df.drop("'Sale Item'] != 'Item1'")

df.query('Sale Item' != 'Item1')
df.query(['Sale Item'] != 'Item1')
df.query("'Sale Item'] != 'Item1'")

Error received in most cases

ImportError: 'numexpr' not found. Cannot use engine='numexpr' for query/eval if 'numexpr' is not installed
asked Oct 05 '15 15:10 by iNoob



People also ask

How do you call a column name with space in pandas?

Inside DataFrame.query() and DataFrame.eval() expressions, you can refer to column names that contain spaces or operators by surrounding them in backticks. The same escaping works for names that start with a digit or that are Python keywords.
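As a minimal sketch of the backtick syntax, the snippet below loads the first three rows of the question's CSV from an in-memory string (the `csv_text` and `filtered` names are only for illustration) and filters on the "Sale Item" column. It assumes a pandas version that supports backtick quoting in query() (0.25 or later, as far as I know), and it passes engine="python" so that numexpr does not have to be installed.

```python
import io

import pandas as pd

# A shortened copy of the question's CSV, held in a string for illustration.
csv_text = '''Date,"price","Sale Item"
2012-06-11,1600.20,item1
2012-06-12,1610.02,item2
2012-06-13,1618.07,item3
'''
df = pd.read_csv(io.StringIO(csv_text))

# Backticks quote the column name that contains a space.
# engine="python" evaluates the expression without numexpr.
filtered = df.query("`Sale Item` != 'item1'", engine="python")
print(filtered)
```

Note that the whole expression is one string; the column name is quoted with backticks and the value with ordinary quotes.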

How do I use the drop method in pandas?

The DataFrame drop() method removes the specified rows or columns. With axis='columns', it removes the specified column; with axis='index', it removes the specified row.

Which method is used to drop a column of DataFrame?

The DataFrame drop() method drops the specified labels from rows or columns. You can remove rows or columns by giving label names together with the corresponding axis, or by passing index or column names directly. With a MultiIndex, labels on different levels can be removed by specifying the level.
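Because drop() works on labels rather than on conditions, dropping rows by cell value means first looking up the index labels of the matching rows. The following is a rough sketch of that idea using a shortened copy of the question's data; the variable names (`rows_to_drop`, `result`, `without_column`) are illustrative only, and the `index=` and `columns=` keywords assume a reasonably recent pandas.

```python
import io

import pandas as pd

# A shortened copy of the question's CSV, held in a string for illustration.
csv_text = '''Date,"price","Sale Item"
2012-06-11,1600.20,item1
2012-06-12,1610.02,item2
2012-06-13,1618.07,item3
'''
df = pd.read_csv(io.StringIO(csv_text))

# Find the index labels of the rows whose "Sale Item" equals "item1" ...
rows_to_drop = df.index[df["Sale Item"] == "item1"]
# ... and pass those labels to drop() to remove the rows.
result = df.drop(index=rows_to_drop)

# Dropping the column itself only needs its name; the space is not a problem.
without_column = df.drop(columns=["Sale Item"])

print(result)
print(without_column)
```

Both calls return new DataFrames and leave `df` unchanged.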


1 Answer

If I understood your issue correctly, you can simply apply a boolean filter on the column:

df = df[df['Sale Item'] != 'item1']

which returns:

         Date    price Sale Item
1  2012-06-12  1610.02     item2
2  2012-06-13  1618.07     item3
3  2012-06-14  1624.40     item4
4  2012-06-15  1626.15     item5
5  2012-06-16  1626.15     item6
6  2012-06-17  1626.15     item7
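For anyone who wants to reproduce this end to end, here is a self-contained sketch that reads the question's CSV from a string and applies the same filter. It also shows one possible reason a filter can appear to do nothing: the attempts in the question compare against 'Item1', while the data contains 'item1', and the comparison is case-sensitive. The lowercase comparison at the end is my own suggestion, not something the question asked for.

```python
import io

import pandas as pd

# The CSV from the question, held in a string for illustration.
csv_text = '''Date,"price","Sale Item"
2012-06-11,1600.20,item1
2012-06-12,1610.02,item2
2012-06-13,1618.07,item3
2012-06-14,1624.40,item4
2012-06-15,1626.15,item5
2012-06-16,1626.15,item6
2012-06-17,1626.15,item7
'''
df = pd.read_csv(io.StringIO(csv_text))

# Boolean mask: True for every row that should be kept.
mask = df["Sale Item"] != "item1"
filtered = df[mask]

# The comparison is case-sensitive, so "Item1" matches nothing
# and every row would be kept.
keeps_everything = (df["Sale Item"] != "Item1").all()

# Optional case-insensitive variant: lowercase both sides before comparing.
filtered_ci = df[df["Sale Item"].str.lower() != "Item1".lower()]

print(filtered)
```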
answered Oct 13 '22 01:10 by Fabio Lamanna
