Search for "does-not-contain" on a DataFrame in pandas

You can use the invert (~) operator (which acts like a not for boolean data):

new_df = df[~df["col"].str.contains(word)]

where new_df is the copy returned by the right-hand side.

contains also accepts a regular expression...

If the above throws a ValueError, the likely cause is missing values or mixed datatypes in the column, so pass na=False:

new_df = df[~df["col"].str.contains(word, na=False)]
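To make the na=False variant concrete, here is a minimal sketch. The sample data is made up for illustration; only "col", word and new_df come from the snippets above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; one row is missing on purpose.
df = pd.DataFrame({"col": ["apple pie", "banana", np.nan, "apple tart"]})
word = "apple"

# na=False treats missing values as "does not contain", so the mask
# is purely boolean and can be inverted with ~ without errors.
mask = df["col"].str.contains(word, na=False)
new_df = df[~mask]

# Rows 1 ("banana") and 2 (NaN) are kept.
print(new_df)
```

Note that the row with the missing value survives the filter here. If you would rather drop it, pass na=True instead.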

Or,

new_df = df[df["col"].str.contains(word) == False]

I was having trouble with the not (~) symbol as well, so here's another way from another StackOverflow thread:

df[df["col"].str.contains('this|that')==False]
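As far as I can tell, comparing with == False and inverting with ~ select the same rows once the mask is strictly boolean. The sketch below checks that on made-up data, passing na=False so missing values cannot get in the way.

```python
import pandas as pd

# Hypothetical sample data, including one missing value.
df = pd.DataFrame({"col": ["this one", "that one", "other", None]})

# 'this|that' is a regular expression matching either word.
contains = df["col"].str.contains("this|that", na=False)

via_compare = df[contains == False]  # comparison style
via_invert = df[~contains]           # inversion style

# Both keep rows 2 ("other") and 3 (missing).
print(via_compare.equals(via_invert))
```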

You can use apply with a lambda:

df[df["col"].apply(lambda x: word not in x)]

Or, if you want to define a more complex rule, you can combine conditions with and:

df[df["col"].apply(lambda x: word_1 not in x and word_2 not in x)]
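One caveat with the apply approach: "word not in x" raises a TypeError when x is NaN, because a float is not iterable. A possible workaround is a small named function with a type check. The helper name lacks_both and the sample data below are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value.
df = pd.DataFrame({"col": ["red fox", "blue jay", np.nan, "green frog"]})
word_1, word_2 = "red", "blue"

def lacks_both(value):
    # Non-string values (such as NaN) cannot contain either word,
    # so they are treated as "does not contain".
    if not isinstance(value, str):
        return True
    return word_1 not in value and word_2 not in value

result = df[df["col"].apply(lacks_both)]

# Rows 2 (NaN) and 3 ("green frog") are kept.
print(result)
```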

The main approaches have already been posted.

I am adding a pattern for matching several words at once and excluding the matching rows from the DataFrame.

Here 'word1', 'word2', 'word3', 'word4' = list of patterns to search

df = DataFrame

column_a = a column name from DataFrame df

values_to_remove = ['word1','word2','word3','word4'] 

pattern = '|'.join(values_to_remove)

result = df.loc[~df['column_a'].str.contains(pattern, case=False)]
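Because the joined pattern is interpreted as a regular expression, words containing characters such as "+" or "." may not match literally. A hedged variant of the same idea escapes each word first with re.escape and adds na=False for missing values. The sample data is hypothetical.

```python
import re
import pandas as pd

# Hypothetical sample data; "C++" contains regex metacharacters.
df = pd.DataFrame({"column_a": ["C++ guide", "Word1 notes", "plain text", None]})
values_to_remove = ["c++", "word1"]

# re.escape keeps "+" from being read as a regex quantifier.
pattern = "|".join(re.escape(v) for v in values_to_remove)

# case=False ignores case; na=False counts missing values as non-matches.
result = df.loc[~df["column_a"].str.contains(pattern, case=False, na=False)]

# Rows 2 ("plain text") and 3 (missing) are kept.
print(result)
```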

I had to get rid of the NULL values before using the command recommended by Andy above. An example:

import pandas as pd

df = pd.DataFrame(index=[0, 1, 2], columns=['first', 'second', 'third'])
# .ix was removed in pandas 1.0; .loc is the label-based replacement
df.loc[:, 'first'] = 'myword'
df.loc[0, 'second'] = 'myword'
df.loc[2, 'second'] = 'myword'
df.loc[1, 'third'] = 'myword'
df

    first   second  third
0   myword  myword   NaN
1   myword  NaN      myword 
2   myword  myword   NaN

Now running the command:

~df["second"].str.contains(word)

I get the following error:

TypeError: bad operand type for unary ~: 'float'

I got rid of the NULL values using dropna() or fillna() first and retried the command with no problem.
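For reference, this is roughly what the two clean-up options look like. It is a sketch on a cut-down version of the frame above, and the variable names are my own.

```python
import pandas as pd

# Cut-down version of the example frame (hypothetical reconstruction).
df = pd.DataFrame({
    "second": ["myword", None, "myword"],
    "third": [None, "myword", None],
})
word = "myword"

# Option 1: replace missing values with an empty string, then filter.
filled = df["second"].fillna("")
kept_with_fill = df[~filled.str.contains(word)]

# Option 2: drop rows where "second" is missing, then filter.
dropped = df.dropna(subset=["second"])
kept_with_drop = dropped[~dropped["second"].str.contains(word)]

print(kept_with_fill)  # row 1 only: its "second" value was missing
print(kept_with_drop)  # empty: every remaining row contains the word
```

The two options are not equivalent: fillna keeps the rows that had missing values, while dropna discards them before the filter runs.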

In addition to nanselm2's answer, you can use 0 instead of False:

df["col"].str.contains(word)==0


Tags:

python

contains

pandas
