How to filter a pandas DataFrame according to a list of tuples?

How can I filter a dataframe by a set of tuples so that only rows with exactly the same pairing are kept? I need a more elegant way of writing this. I'm trying not to use merge because it would be less efficient.

So I have a list of tuples called tup_list: [('118', '35'), ('35', '35'), ('118', '202')]. Assuming the first element in each tuple is A and the second is B, I am trying to filter my dataframe according to this tup_list, where the pairing needs to be the same.

Original dataframe:

A   B
118 35
118 40
35  202
118 1
35  35

After filtering according to the tup_list, the new dataframe should be:

A   B
118 35
35  35

Only exact pairings should be returned.

Currently I'm using df = df.merge(tup_list, on=['A','B'], how='inner'), but it is not very efficient as my actual data is larger.

Please advise on a more efficient way of writing this.

R_abcdefg asked Dec 21 '18 07:12


1 Answer

Use boolean indexing:

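import pandas as pd
tup_list = [(118, 35), (35, 35), (118, 202)]
# index=df.index keeps the mask aligned even if df has a non-default index
df[pd.Series(list(zip(df['A'], df['B'])), index=df.index).isin(tup_list)]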

    A   B
0   118 35
4   35  35

list(zip(df['A'], df['B'])) turns your two columns into a list of tuples:

[(118, 35), (118, 40), (35, 202), (118, 1), (35, 35)]

which is then turned into a Series, and isin returns a boolean mask:

0     True
1    False
2    False
3    False
4     True
dtype: bool

This mask can then be used for boolean indexing.
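
For reference, here is a self-contained sketch of the steps above using the sample data from the question. It assumes both columns hold integers, as in this answer; if the columns hold strings, the tuples in tup_list would need to be strings too, otherwise nothing matches. The names pairs, mask_zip, mask_mi and filtered are only illustrative. The second mask is an alternative that was not part of the original answer: it builds a MultiIndex from the two columns and calls its isin method, which may be worth timing against the zip version on larger data.

```python
import pandas as pd

# Sample data copied from the question (integer columns, as in the answer)
df = pd.DataFrame({"A": [118, 118, 35, 118, 35],
                   "B": [35, 40, 202, 1, 35]})
tup_list = [(118, 35), (35, 35), (118, 202)]

# Approach from the answer: a Series of (A, B) tuples, aligned to df's index
pairs = pd.Series(list(zip(df["A"], df["B"])), index=df.index)
mask_zip = pairs.isin(tup_list)

# Alternative sketch: build a MultiIndex from the two columns and use its isin,
# which returns a boolean NumPy array in row order
mask_mi = pd.MultiIndex.from_frame(df[["A", "B"]]).isin(tup_list)

# Either mask can be used for boolean indexing
filtered = df[mask_mi]
print(filtered)
```

Both masks select the rows at index 0 and 4, matching the output shown earlier in this answer.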

It_is_Chris answered Sep 23 '22 19:09