I'm wondering if there is a more efficient way to use the str.contains() function in Pandas, to search for two partial strings at once. I want to search a given column in a dataframe for data that contains either "nt" or "nv". Right now, my code looks like this:
df[df['Behavior'].str.contains("nt", na=False)]
df[df['Behavior'].str.contains("nv", na=False)]
And then I append one result to another. What I'd like to do is use a single line of code to search for any data that includes "nt" OR "nv" OR "nf." I've played around with some ways that I thought should work, including just sticking a pipe between terms, but all of these result in errors. I've checked the documentation, but I don't see this as an option. I get errors like this:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-113-1d11e906812c> in <module>()
      3
      4
----> 5 soctol = f_recs[f_recs['Behavior'].str.contains("nt"|"nv", na=False)]
      6 soctol

TypeError: unsupported operand type(s) for |: 'str' and 'str'
Is there a fast way to do this? Thanks for any help, I am a beginner but am LOVING pandas for data wrangling.
The str.contains() function tests whether a pattern or regex is contained within each string of a Series or Index, and returns a boolean Series or Index accordingly. Parameter: pat, a character sequence or regular expression.
You can create a DataFrame from multiple Series objects by adding each Series as a column. The concat() function can merge multiple Series together into a DataFrame.
Using "contains" to Find a Substring in a Pandas DataFrame: the contains method returns a boolean value for each element of the Series, True if the original value contains the substring and False if not. A basic application of contains looks like Series.str.contains("substring").
Python's string __contains__() is an instance method that returns True or False depending on whether the string object contains the specified string object. Note that __contains__() is case sensitive.
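As a rough illustration of the behavior described above, here is a minimal sketch. The sample values are made up for the example; na=False makes missing values count as False, and regex=False treats the pattern as a literal substring rather than a regular expression.

```python
import pandas as pd

# Hypothetical sample data, including a missing value.
s = pd.Series(["ant", "bee", None])

# One boolean per element; na=False maps the missing value to False.
mask = s.str.contains("nt", na=False)

# With regex=False the pattern is matched literally, so "|" is just a character.
literal = pd.Series(["nt|nv", "nt"]).str.contains("nt|nv", regex=False)
```

The boolean result can then be used to index the Series or its parent DataFrame, as in s[mask].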
They should be one regular expression, and should be in one string:
"nt|nv"  # rather than "nt" | "nv"
f_recs[f_recs['Behavior'].str.contains("nt|nv", na=False)]
Python doesn't let you use the or (|) operator on strings:
In [1]: "nt" | "nv"
TypeError: unsupported operand type(s) for |: 'str' and 'str'
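For the three-term case from the question ("nt" OR "nv" OR "nf"), a minimal sketch might look like the following. The DataFrame contents are invented for illustration; only the 'Behavior' column name comes from the question. Building the pattern from a list with re.escape is an optional extra, useful if a search term could contain regex metacharacters.

```python
import re
import pandas as pd

# Hypothetical sample data; only the 'Behavior' column name is from the question.
df = pd.DataFrame({"Behavior": ["nt", "xnv1", "other", None, "nf"]})

# The alternatives go inside ONE string, separated by |, forming a single regex.
mask = df["Behavior"].str.contains("nt|nv|nf", na=False)
result = df[mask]

# Optional: build the same pattern from a list of terms.
# re.escape keeps any regex metacharacters in the terms literal.
terms = ["nt", "nv", "nf"]
pattern = "|".join(re.escape(t) for t in terms)
result_from_list = df[df["Behavior"].str.contains(pattern, na=False)]
```

This gives all matching rows in one pass, so there is no need to filter twice and append the results.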