How to filter a DataFrame by a set of tuples, so that the pairing is the same? I need a more elegant way of writing this. I'm trying not to use merge because it will make the code less efficient.
So I have a list of tuples called tup_list:
[('118', '35'), ('35', '35'), ('118', '202')]
Assuming the first element in each tuple is A, and the second is B, I am trying to filter my dataframe according to this tup_list, where the pairing needs to be the same.
Original dataframe:
A B
118 35
118 40
35 202
118 1
35 35
After filtering according to the tup_list, the new dataframe should be:
A B
118 35
35 35
Only exact pairings should be returned.
Currently I'm using df = df.merge(tup_list, on=['A','B'], how='inner'), but it is not very efficient as my actual data is larger.
Please advise on a more efficient way of writing this.
Using Loc to Filter With Multiple Conditions
The loc indexer in pandas can be used to access groups of rows or columns by label or by a boolean mask. Wrap each condition you want included in the filtered result in parentheses and combine them with the & operator. The result is a pd.DataFrame of the filtered rows.
You can use df[df["Courses"] == 'Spark'] to filter rows by a condition in a pandas DataFrame. Note that this expression returns a new DataFrame with the selected rows. You can also write the above statement with a variable.
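As a rough sketch of how this applies to the question's data, the conditions for each (A, B) pair can be combined with & and the pairs OR-ed together. The frame below is rebuilt from the question, and integer columns are an assumption (the question's tup_list holds strings). This is only an illustration of the loc approach, not necessarily the fastest option for a long list of pairs.

```python
import pandas as pd

# Sample frame from the question; integer dtype is an assumption.
df = pd.DataFrame({"A": [118, 118, 35, 118, 35],
                   "B": [35, 40, 202, 1, 35]})
tup_list = [(118, 35), (35, 35), (118, 202)]

# One pair: both conditions must hold, so they are joined with &.
one_pair = df.loc[(df["A"] == 118) & (df["B"] == 35)]

# Several pairs: OR the per-pair masks together, starting from all False.
mask = pd.Series(False, index=df.index)
for a, b in tup_list:
    mask |= (df["A"] == a) & (df["B"] == b)

filtered = df.loc[mask]
print(filtered)
```

Each comparison needs its own parentheses because & binds more tightly than == in Python.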
Use boolean indexing (the tuples must hold the same type as the column values, integers here):
tup_list = [(118, 35), (35, 35), (118, 202)]
df[pd.Series(list(zip(df['A'], df['B'])), index=df.index).isin(tup_list)]
A B
0 118 35
4 35 35
list(zip(df['A'], df['B']))
turns your two columns into a list of tuples:
[(118, 35), (118, 40), (35, 202), (118, 1), (35, 35)]
which you then turn into a Series and pass to isin
to get a boolean mask:
0 True
1 False
2 False
3 False
4 True
dtype: bool
This mask can then be used for boolean indexing.
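Putting the steps together, here is a self-contained sketch under a few assumptions: the frame is rebuilt from the question with integer columns, and the question's string tuples are converted to integers first so the types match. It also shows MultiIndex.isin as an alternative way to build the same mask; which one is faster on large data is not claimed here and would need to be measured.

```python
import pandas as pd

# Sample frame from the question; integer dtype is an assumption.
df = pd.DataFrame({"A": [118, 118, 35, 118, 35],
                   "B": [35, 40, 202, 1, 35]})

# The question's list holds strings; convert so the tuples match the int columns.
tup_list_str = [('118', '35'), ('35', '35'), ('118', '202')]
tup_list = [(int(a), int(b)) for a, b in tup_list_str]

# Zip the two columns into tuples and test membership.
# Reusing df.index keeps the mask aligned even if the index is not 0..n-1.
mask = pd.Series(list(zip(df["A"], df["B"])), index=df.index).isin(tup_list)
filtered = df[mask]

# Alternative: build a MultiIndex from the two columns and use its isin.
mask_mi = pd.MultiIndex.from_frame(df[["A", "B"]]).isin(tup_list)
filtered_mi = df[mask_mi]

print(filtered)
```

If the columns actually hold strings, skip the int conversion and keep tup_list as strings instead; a mismatch in types makes every comparison fail silently and returns an empty frame.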