I want to filter a dataframe to find rows which do not contain the string 'site'.
I know how to filter for rows which do contain 'site' but have not been able to get the reverse working. Here is what I have so far:
def rbs():  # removes blocked sites
    frame = fill_rate()
    mask = frame[frame['Media'].str.contains('Site')==True]
    frame = (frame != mask)
    return frame
But this returns an error, of course.
Using "contains" to Find a Substring in a Pandas DataFrame
The contains method returns a boolean Series: True where the original value contains the substring and False where it does not. A basic call looks like Series.str.contains("substring").
Getting rows where values do not contain a substring in a Pandas DataFrame
To get rows where values do not contain a substring, apply the negation operator ~ to the boolean result of str.contains().
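As a minimal sketch of how the mask and its negation behave, the snippet below builds a small made-up frame (the sample values are assumptions; only the 'Media' column name comes from the question) and prints both masks before filtering.

```python
import pandas as pd

# Hypothetical sample data; only the 'Media' column name is from the question.
frame = pd.DataFrame({"Media": ["Site A", "Banner", "Video Site", "Email"]})

# Boolean Series: True where the value contains 'Site'.
mask = frame["Media"].str.contains("Site")
print(mask.tolist())     # [True, False, True, False]

# ~ flips every element, so True marks rows that do NOT contain 'Site'.
print((~mask).tolist())  # [False, True, False, True]

# Indexing with the negated mask keeps only the non-matching rows.
kept = frame[~mask]
print(kept["Media"].tolist())  # ['Banner', 'Email']
```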
Just do frame[~frame['Media'].str.contains('Site')]
The ~ operator negates the boolean condition.
So your method becomes:
def rbs():  # removes blocked sites
    frame = fill_rate()
    return frame[~frame['Media'].str.contains('Site')]
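One thing to watch: the question asks about 'site' while the code matches 'Site', and str.contains is case-sensitive by default. If both spellings should be excluded, a sketch along these lines may help. The sample values are made up; case=False and regex=False are existing str.contains parameters, and using them here is my assumption about what is wanted, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data with mixed capitalisation.
frame = pd.DataFrame({"Media": ["Site A", "website banner", "Video", "SITE B"]})

# case=False ignores capitalisation; regex=False treats 'site' as a
# literal substring rather than a regular expression.
contains_site = frame["Media"].str.contains("site", case=False, regex=False)

kept = frame[~contains_site]
print(kept["Media"].tolist())  # ['Video']
```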
EDIT
It looks like you have NaN values, judging by your errors, so you have to filter these out first. Your method becomes:
def rbs():  # removes blocked sites
    frame = fill_rate()
    frame = frame[frame['Media'].notnull()]
    return frame[~frame['Media'].str.contains('Site')]
The notnull call filters out the missing values.
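For comparison, here is a hedged sketch of the notnull approach next to an alternative: passing na=False to str.contains, which treats missing values as "does not contain". The two differ in what happens to rows with a missing 'Media' value, so which one fits depends on whether those rows should be kept. The sample data is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with missing values in 'Media'.
frame = pd.DataFrame({"Media": ["Site A", np.nan, "Banner", None]})

# Option 1 (the approach above): drop missing rows, then negate the mask.
non_null = frame[frame["Media"].notnull()]
kept_drop = non_null[~non_null["Media"].str.contains("Site")]
print(kept_drop["Media"].tolist())  # ['Banner']

# Option 2: na=False makes missing values count as False in the mask,
# so after negation the rows with missing 'Media' are kept.
kept_keep = frame[~frame["Media"].str.contains("Site", na=False)]
print(kept_keep.index.tolist())  # [1, 2, 3]
```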