I want to filter a DataFrame, keeping only the rows whose value in one column appears in a list.
df[df['column'].isin(mylist)]
But I found that it is case-sensitive. Is there a way to use ".isin()" case-insensitively?
str.contains has a case parameter that is True by default. Set it to False to do a case-insensitive match.
pandas.DataFrame.merge (similar to a SQL join) is case-sensitive, as are most Python functions.
The str.contains() function tests whether a pattern or regex is contained within each string of a Series or Index. It returns a boolean Series or Index with one result per element.
str.lower() converts all characters to lowercase.
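As a rough sketch of how the case parameter could be applied to the list problem: str.contains does substring matching, so the list is turned into an anchored regular expression here to get whole-value matches only. The DataFrame contents and the names pattern and mask are assumptions made for illustration.

```python
import re

import pandas as pd

# Illustrative data; column name and values are assumptions.
df = pd.DataFrame({"Color": ["Green", "Green", "Red", "Red", "Blue", "Blue"],
                   "Val": [1, 1, 2, 2, 3, 3]})
mylist = ["green", "BLUE"]

# Build an anchored alternation such as ^(?:green|BLUE)$ so that only
# whole values match; re.escape protects any regex metacharacters.
pattern = "^(?:" + "|".join(re.escape(s) for s in mylist) + ")$"

# case=False makes the match case-insensitive; na=False treats missing
# values as non-matching instead of propagating NaN into the mask.
mask = df["Color"].str.contains(pattern, case=False, regex=True, na=False)
filtered = df[mask]
print(filtered)
```

For plain membership tests, lowercasing both sides and calling isin() is usually simpler; the regex route is mainly useful when partial or pattern matches are wanted as well.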
One way is to lowercase (or uppercase) both the Series and the list, then compare the two:
df[df['column'].str.lower().isin([x.lower() for x in mylist])]
The advantage here is that neither the original df nor the list is modified: the lowercased copies are temporary and used only for the comparison.
Consider this dummy df:
   Color  Val
0  Green    1
1  Green    1
2    Red    2
3    Red    2
4   Blue    3
5   Blue    3
For the list l:
l = ['green', 'BLUE']
You can use isin() on the lowercased column:
df[df['Color'].str.lower().isin([x.lower() for x in l])]
You get:
   Color  Val
0  Green    1
1  Green    1
4   Blue    3
5   Blue    3
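The example above can be put together as a self-contained script. This is a minimal sketch that rebuilds the dummy df from the printed table; the variable names wanted and result are illustrative.

```python
import pandas as pd

# Rebuild the dummy DataFrame shown above.
df = pd.DataFrame({"Color": ["Green", "Green", "Red", "Red", "Blue", "Blue"],
                   "Val": [1, 1, 2, 2, 3, 3]})
l = ["green", "BLUE"]

# Lowercase the list once, then compare against the lowercased column.
wanted = [x.lower() for x in l]
result = df[df["Color"].str.lower().isin(wanted)]
print(result)

# The original column keeps its casing; only temporary copies were lowercased.
print(df["Color"].tolist())
```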
I prefer to use the more general .apply with a set:
myset = {s.lower() for s in mylist}
df[df['column'].apply(lambda v: v.lower() in myset)]
A lookup in a set is faster than a lookup in a list: set membership is a hash lookup, while a list is scanned element by element.
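One caveat with the .apply approach is that v.lower() raises an AttributeError when the column contains missing values. A possible variation, sketched here with made-up data, guards the lambda with an isinstance check so that non-string entries simply count as non-matching.

```python
import pandas as pd

# Illustrative data with a missing value; names and values are assumptions.
df = pd.DataFrame({"column": ["Green", None, "BLUE", "red"]})
mylist = ["green", "Blue"]

# Lowercase the list once into a set for fast membership tests.
myset = {s.lower() for s in mylist}

# Non-string entries (such as missing values) are treated as non-matching
# instead of raising an AttributeError on .lower().
mask = df["column"].apply(lambda v: isinstance(v, str) and v.lower() in myset)
matched = df[mask]
print(matched)
```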