Using str.contains on pandas dataframe [duplicate]

Question

This pandas code raises the error

"TypeError: bad operand type for unary ~: 'float'"

I have no idea why, because I'm trying to manipulate a str object.

# filters the frame, keeping only rows where the reason is NOT File or Registry
df_Anomalous_Vendor_Reasons[~df_Anomalous_Vendor_Reasons['V'].str.contains("File*|registry*")]

Anybody got any ideas?

Josh · Accepted Answer

Credit to Davtho1983's comment above; I thought I'd expand on it for clarity.

For anyone who stumbles on this later with the same error (like me): it's a very simple fix. The pandas documentation gives the signature as

Series.str.contains(pat, case=True, flags=0, na=nan, regex=True)

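What's happening is that, by default, contains() is not applied to NA values in the column; they stay NA (NaN, which is a float). The resulting mask therefore mixes Booleans with floats, and the invert operator ~ fails on the float entries. You just need to fill the NA values with a Boolean via the na argument so that ~ can be applied.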
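To see the failure in isolation, here is a minimal sketch. The Series values are made up for illustration, and dtype=object is forced on the assumption that the question's column is a classic object-dtype string column; other string dtypes may behave differently.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the question's 'V' column (assumption: object dtype).
s = pd.Series([np.nan, "File changed", "registry edit", "Other"], dtype=object)

# With the default na handling, the missing entry stays NaN in the result,
# so the mask cannot be a pure Boolean Series.
mask = s.str.contains("File|registry")
print(mask.dtype)

# Inverting the mixed mask fails on the float NaN entry.
try:
    ~mask
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Printing the mask itself shows the NaN sitting alongside the True/False values, which is usually the quickest way to confirm this is the cause.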

With the example above one should use

df_Anomalous_Vendor_Reasons[~df_Anomalous_Vendor_Reasons['V'].str.contains("File*|registry*", na=False)]

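Of course, choose False or True for the na argument based on the intended logic. Whichever Boolean you choose for filling NA will be inverted by ~: with na=False the rows with missing values are kept by the filter above, and with na=True they are dropped.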
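A small sketch of how the choice of na changes which rows survive the inverted filter. The DataFrame contents are invented for illustration. The pattern is also simplified to "File|registry", which is my assumption about the intent: in a regular expression, File* means "Fil" followed by zero or more "e", so the trailing * is not a wildcard, and contains() already matches anywhere in the string.

```python
import numpy as np
import pandas as pd

# Invented data standing in for the question's frame; row 2 has a missing reason.
df = pd.DataFrame(
    {"V": ["File changed", "registry edit", np.nan, "Other"]},
    dtype=object,
)

pattern = "File|registry"  # simplified pattern (assumption, see note above)

# na=False: missing values count as "does not contain", so ~ keeps them.
kept_with_na_false = df[~df["V"].str.contains(pattern, na=False)]

# na=True: missing values count as "contains", so ~ drops them.
kept_with_na_true = df[~df["V"].str.contains(pattern, na=True)]

print(kept_with_na_false.index.tolist())
print(kept_with_na_true.index.tolist())
```

If rows with a missing reason should be handled separately, it may be clearer to split them off first with isna() and then filter the remainder.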

Tags: python, arrays, string, pandas, excel

Davtho1983

1 Answer

Josh
