
Deleting rows based on multiple conditions Python Pandas


I want to delete rows when a few conditions are met:

For instance, a random DataFrame is generated:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 4), columns=['one', 'two', 'three', 'four'])
print(df)

One instance of the table is shown below:

        one       two     three      four
0 -0.225730 -1.376075  0.187749  0.763307
1  0.031392  0.752496 -1.504769 -1.247581
2 -0.442992 -0.323782 -0.710859 -0.502574
3 -0.948055 -0.224910 -1.337001  3.328741
4  1.879985 -0.968238  1.229118 -1.044477
5  0.440025 -0.809856 -0.336522  0.787792
6  1.499040  0.195022  0.387194  0.952725
7 -0.923592 -1.394025 -0.623201 -0.738013
8 -1.775043 -1.279997  0.194206 -1.176260
9 -0.602815  1.183396 -2.712422 -0.377118

I want to delete rows based on the conditions that:

A row should be deleted if the value in column 'one', 'two', or 'three' is greater than 0 and the value in column 'four' is less than 0.

I tried to implement this as follows:

df = df[df.one > 0 or df.two > 0 or df.three > 0 and df.four < 1] 

However, this results in the following error message:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() 

Could someone help me delete rows based on multiple conditions?

fyr91 asked Mar 12 '15 18:03





1 Answer

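pandas works with the element-wise operators | and &, but not with the boolean keywords or and and. Python's or and and need a single True/False value from each operand, and a Series with more than one element has no unambiguous truth value, which is what the ValueError reports. Because | and & bind more tightly than comparisons such as >, each comparison has to be wrapped in parentheses.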
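A small sketch of the difference, assuming a hypothetical three-element Series named s (not part of the question's data):

```python
import pandas as pd

s = pd.Series([-1.0, 0.5, 2.0])  # hypothetical example data

# `or` asks Python for one truth value per operand, which is ambiguous
# for a Series with several elements, so pandas raises ValueError.
try:
    result = (s > 0) or (s > 1)
    raised = False
except ValueError:
    raised = True

# `|` and `&` are applied element by element and return a boolean Series.
either = (s > 0) | (s > 1)
both = (s > 0) & (s > 1)

print(raised)
print(either.tolist())
print(both.tolist())
```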

Try this instead:

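# & binds more tightly than |, so the OR terms are grouped explicitly;
# ~ inverts the mask so that matching rows are dropped rather than kept.
df = df[~(((df.one > 0) | (df.two > 0) | (df.three > 0)) & (df.four < 0))]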
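The condition stated in the question can also be built step by step as named masks, which sidesteps operator-precedence surprises (& binds more tightly than |). A minimal sketch, assuming a small hand-written DataFrame in place of the random one; the names any_positive and to_delete are hypothetical:

```python
import pandas as pd

# Hypothetical fixed data so the result is reproducible.
df = pd.DataFrame({
    'one':   [-0.2,  0.5, -0.4,  1.8],
    'two':   [-1.3, -0.7, -0.3, -0.9],
    'three': [ 0.1, -1.5, -0.7,  1.2],
    'four':  [ 0.7, -1.2, -0.5,  0.9],
})

# True where any of 'one', 'two', 'three' is greater than 0.
any_positive = (df['one'] > 0) | (df['two'] > 0) | (df['three'] > 0)

# Rows to delete: any of the first three positive AND 'four' below 0.
to_delete = any_positive & (df['four'] < 0)

# ~ inverts the mask, so only rows that do not match are kept.
df = df[~to_delete]
print(df)
```

With this data only the second row (index 1) meets the deletion condition, so the rows with index 0, 2, and 3 remain.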
Brionius answered Oct 11 '22 12:10
