
python pandas loc - filter for list of values [duplicate]

This should be incredibly easy, but I can't get it to work.

I want to filter my dataset on two or more values.

# this works when I filter for one value
df.loc[df['channel'] == 'sale']

# if I have to filter on two separate columns, I can do this
df.loc[(df['channel'] == 'sale') & (df['type'] == 'A')]

# but what if I want to filter one column by more than one value?
df.loc[df['channel'] == ('sale','fullprice')]

Would this have to be an OR statement, or can I do something like SQL's IN?

jeangelj asked Aug 21 '17 18:08



1 Answer

There is a df.isin(values) method which tests whether each element in the DataFrame is contained in values. The same method exists on a single column (a Series), where it returns a boolean mask. So, as @MaxU wrote in the comment, you can use

df.loc[df['channel'].isin(['sale','fullprice'])] 

to filter one column by multiple values.
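For anyone who wants to try this out, here is a minimal sketch. The sample data is made up for illustration; only the column names channel and type come from the question. It compares isin with the explicit OR form the question asks about, and shows that ~ inverts the mask, roughly like SQL's NOT IN.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "channel": ["sale", "fullprice", "outlet", "sale"],
    "type": ["A", "B", "A", "B"],
})

wanted = ["sale", "fullprice"]

# Membership test on one column, comparable to SQL's IN.
by_isin = df.loc[df["channel"].isin(wanted)]

# The same filter written as an explicit OR of two comparisons.
by_or = df.loc[(df["channel"] == "sale") | (df["channel"] == "fullprice")]

# Inverting the boolean mask with ~ keeps the rows NOT in the list.
not_in = df.loc[~df["channel"].isin(wanted)]

print(by_isin)
print(not_in)
```

The OR form gives the same rows, but isin is easier to maintain once the list of values grows.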

taras answered Sep 28 '22 10:09