I am often confused about how to filter the items of a DataFrame column: should I use isin, .str.contains, or "aa" in df["column"]?
Which of these should be used in which case?
Use isin if you want to check whether each value of a Series exactly equals one of several strings:
import pandas as pd
things = pd.Series(['apple', 'banana', 'house', 'car'])
fruits = ['apple', 'banana', 'kiwi']
things.isin(fruits)
Output:
0 True
1 True
2 False
3 False
dtype: bool
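Since the question is about filtering, here is a minimal sketch of how the Boolean mask returned by isin can be used to select rows. The DataFrame df and its column names thing and price are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for this example.
df = pd.DataFrame({
    "thing": ["apple", "banana", "house", "car"],
    "price": [1, 2, 300, 200],
})
fruits = ["apple", "banana", "kiwi"]

# isin returns a Boolean mask; indexing with it keeps the rows where it is True.
fruit_rows = df[df["thing"].isin(fruits)]

# Negating the mask with ~ keeps all the other rows.
other_rows = df[~df["thing"].isin(fruits)]

print(fruit_rows)
print(other_rows)
```

Note that 'kiwi' in the list does no harm: isin only asks whether each value of the column appears in the list, not the other way round.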
.str.contains also returns a Boolean mask, but it takes a single pattern (interpreted as a regular expression by default) and also matches parts of strings.
things.str.contains('apple')
Output:
0 True
1 False
2 False
3 False
dtype: bool
things.str.contains('app')
Output:
0 True
1 False
2 False
3 False
dtype: bool
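Because the pattern is a regular expression by default, .str.contains can also test for several substrings at once. The sketch below shows this together with the regex, case and na parameters; the None entry is added here only to show how missing values are handled and is not part of the original series.

```python
import pandas as pd

# Same series as above, plus one missing value (an addition for illustration).
things = pd.Series(["apple", "banana", "house", "car", None])

# "|" means "or" in a regular expression; na=False maps missing values to False.
either = things.str.contains("app|ban", na=False)

# regex=False treats the pattern literally, so "." no longer matches any character.
literal = things.str.contains("a.p", regex=False, na=False)

# case=False makes the match case-insensitive.
any_case = things.str.contains("APPLE", case=False, na=False)

print(either.tolist())
print(literal.tolist())
print(any_case.tolist())
```

Without na=False the missing entry would yield a missing value in the result, and such a mask cannot be used directly for filtering.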
A in series checks whether A is in the index of the pd.Series, not in its values:
"apple" in things
# Output: False
Our things series has no 'apple' in its index, which becomes clear when we print it:
> things
0 apple
1 banana
2 house
3 car
dtype: object
The first column describes the index, so we can check it:
0 in things
# Output: True
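If the goal is to test whether a string occurs among the values rather than the index labels, one option is to run the check against the underlying array or to use a vectorized comparison. This is a sketch of two common approaches, not the only way to do it.

```python
import pandas as pd

things = pd.Series(["apple", "banana", "house", "car"])

# Membership on the Series itself looks at the index labels 0..3.
in_index = "apple" in things

# Membership on the underlying NumPy array looks at the values.
in_values = "apple" in things.to_numpy()

# Vectorized alternative: compare element-wise, then reduce with any().
any_equal = things.eq("apple").any()

print(in_index, in_values, any_equal)
```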