Suppose I have the following pandas.Series:
import pandas as pd
s = pd.Series([1,3,5,True,6,8,'findme', False])
I can use the in operator to find any of the integers or Booleans. For example, the following both yield True:
1 in s
True in s
However, this yields False when I do:
'findme' in s
My workaround is to use pandas.Series.str, or to first convert the Series to a list and then use the in operator:
True in s.str.contains('findme')
s2 = s.tolist()
'findme' in s2
Any idea why I can't directly use the in operator to find a string in a Series?
Series.str.contains() tests whether a pattern or regex is contained within each string of a Series or Index. It returns a boolean Series or Index that is True where the original value contains the pattern and False where it does not. A basic call looks like Series.str.contains("substring").
Series.str.find() searches for a substring in each string of a Series. If the substring is found, it returns the lowest index of its occurrence; if not, it returns -1. Start and end positions can also be passed to restrict the search to part of each string.
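As a minimal sketch of the two string methods described above, the following uses a small string-only Series. The Series `words` and its contents are made up for illustration and are not part of the question; `regex=False` is passed so the pattern is treated as a plain substring.

```python
import pandas as pd

# Hypothetical string-only Series, chosen so every element is a string
words = pd.Series(["findme", "please findme", "nothing here"])

# Boolean mask: True where the element contains the substring
mask = words.str.contains("findme", regex=False)

# Lowest index of the substring within each element, or -1 if absent
positions = words.str.find("findme")

print(mask.tolist())       # [True, True, False]
print(positions.tolist())  # [0, 7, -1]
```

Note that both methods test for a substring inside each element, which is not the same as asking whether the Series holds a value exactly equal to the string.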
Any idea why I can't directly use the in operator to find a string in a Series?
Think of a Series more like an ordered dictionary than a list: membership testing on a Series checks the index (like the keys of a dictionary), not the values. You can access the values through the .values attribute:
>>> s = pd.Series([1,3,5,True,6,8,'findme', False])
>>> 7 in s
True
>>> 7 in s.values
False
>>> 'findme' in s
False
>>> 'findme' in s.values
True
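The same point can be sketched as a script, together with two other ways to test the values explicitly. The variable names are only illustrative, and `isin` and the elementwise `==` comparison are alternatives offered here, not something the answer above prescribes.

```python
import pandas as pd

s = pd.Series([1, 3, 5, True, 6, 8, 'findme', False])

# `in` on a Series looks at the index labels (0..7 here), not the values
label_hit = 7 in s              # 7 is a label, so this is True
value_as_label = 'findme' in s  # 'findme' is not a label, so this is False

# Explicit value-membership tests
in_values = 'findme' in s.values           # test against the underlying array
via_isin = bool(s.isin(['findme']).any())  # boolean mask, then any()
via_eq = bool((s == 'findme').any())       # elementwise equality, then any()

print(label_hit, value_as_label, in_values, via_isin, via_eq)
```

By the same reasoning, the question's `True in s.str.contains('findme')` also tests index labels rather than the mask values, so reducing a boolean mask with `.any()` is likely the more reliable form.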