Remove rows not .isin('X') [duplicate]

Sorry, I'm just getting into pandas, and this seems like it should be a very straightforward question. How can I use isin('X') to remove the rows whose values are in the list X? In R I would write !which(a %in% b).

DrewH asked Dec 27 '12 15:12


2 Answers

You have many options. Collating some of the answers above and the accepted answer from this post, you can do:
1. df[-df["column"].isin(["value"])]
2. df[~df["column"].isin(["value"])]
3. df[df["column"].isin(["value"]) == False]
4. df[np.logical_not(df["column"].isin(["value"]))]

Note: for option 4 you'll need to import numpy as np.
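To see that options 2 to 4 keep the same rows, here is a minimal sketch. The sample DataFrame, the extra "n" column and the variable names are assumptions made up for illustration; "column" and "value" simply mirror the placeholders above. Option 1 is left out because ~ is the conventional operator for inverting a boolean mask.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "column" and "value" mirror the placeholders above.
df = pd.DataFrame({
    "column": ["value", "other", "value", "keep"],
    "n": [1, 2, 3, 4],
})

# Boolean Series: True for the rows we want to remove.
mask = df["column"].isin(["value"])

kept_tilde = df[~mask]               # option 2: bitwise NOT on the mask
kept_eq = df[mask == False]          # option 3: works, though linters often flag "== False"
kept_np = df[np.logical_not(mask)]   # option 4: NumPy ufunc applied to the mask

print(kept_tilde)
```

All three results should hold the same two rows ("other" and "keep") with their original index labels; add .reset_index(drop=True) if you want a fresh index.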

Update: You can also use the .query method for this, which allows for method chaining:
5. df.query("column not in @values")
where values is a list of the values that you don't want to include.
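A small sketch of option 5 in a method chain, assuming a made-up DataFrame. The data, the "n" column and the reset_index step are illustrative assumptions; only the query string comes from the answer. The @ prefix tells query to look up values as a Python variable rather than a column name.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"column": ["a", "b", "c", "d"], "n": [1, 2, 3, 4]})

# Values we do NOT want to keep.
values = ["b", "d"]

result = (
    df.query("column not in @values")   # drop rows whose "column" is in values
      .reset_index(drop=True)           # optional: renumber the remaining rows
)

print(result)
```

Because query returns a new DataFrame, it slots into a chain without a temporary mask variable.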

Jonny Brooks answered Sep 22 '22 23:09


You can use numpy.logical_not to invert the boolean array returned by isin:

In [63]: s = pd.Series(np.arange(10.0))

In [64]: x = range(4, 8)

In [65]: mask = np.logical_not(s.isin(x))

In [66]: s[mask]
Out[66]:
0    0
1    1
2    2
3    3
8    8
9    9

As noted in a comment by Wes McKinney, you can also use:

s[~s.isin(x)] 
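
If you filter like this often, the ~ form can be wrapped in a small function. This is only a sketch: drop_values is a hypothetical helper name, not part of pandas, and the integer Series is made-up sample data. It plays roughly the role of a[!(a %in% b)] in R.

```python
import pandas as pd


def drop_values(series, unwanted):
    """Return the entries of `series` whose values are not in `unwanted`.

    Hypothetical helper for illustration; not part of pandas.
    """
    # isin builds a boolean mask; ~ inverts it so matches are dropped.
    return series[~series.isin(list(unwanted))]


s = pd.Series(range(10))
kept = drop_values(s, range(4, 8))

print(kept.tolist())  # [0, 1, 2, 3, 8, 9]
```

The original index labels are preserved, so kept still has labels 0 to 3 and 8 to 9.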
bmu answered Sep 18 '22 23:09