In SQL we can, for example, do select * from table where col1 not in ('A','B');
I was wondering if there is a PySpark equivalent for this. I was able to find the isin function for a SQL-like IN clause, but nothing for NOT IN.
I just had the same issue and found a solution. If you want to negate any condition (represented in PySpark as a Column object), there is the negation operator ~, for example:
df.where(~df.flag.isin(1, 2, 3)) # records with flag NOT IN (1, 2, 3)