pyspark sql dataframe keep only null [duplicate]

I have a Spark SQL DataFrame df with a column user_id. How do I filter the DataFrame to keep only the rows where user_id is actually null, for further analysis? The pyspark module page here shows how to drop rows with NA values easily, but does not say how to do the opposite.

I tried df.filter(df.user_id == 'null'), but the result is 0 rows; presumably it is comparing against the literal string "null". df.filter(df.user_id == null) won't work either, since it raises a NameError: Python has no bare null, only None.
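
For illustration, a minimal sketch of why those attempts come up empty (the df setup is assumed; under SQL's three-valued logic, NULL compared to anything evaluates to NULL rather than true, so filter() drops every row):

# df is assumed to have a nullable user_id column.
df.filter(df.user_id == 'null')   # matches the string "null", not SQL NULL: 0 rows
# df.filter(df.user_id == null)   # NameError: Python has no `null` keyword
df.filter(df.user_id == None)     # NULL == NULL evaluates to NULL, not true: 0 rows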

hdy asked Sep 15 '25 16:09

1 Answer

Try

df.filter(df.user_id.isNull())
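
To make that concrete, here is a self-contained sketch; the local session and the sample data are assumptions for illustration, not part of the original question:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.master("local[1]").appName("null-filter").getOrCreate()

# Hypothetical sample data with a nullable user_id column.
df = spark.createDataFrame(
    [(1, "alice"), (None, "bob"), (3, None)],
    schema="user_id INT, name STRING",
)

# Keep only the rows where user_id is actually NULL.
df.filter(df.user_id.isNull()).show()   # prints just the (null, "bob") row

# Equivalent spelling with a column expression:
df.filter(F.col("user_id").isNull())

# The opposite (drop rows with null user_id) is isNotNull(),
# or df.na.drop(subset=["user_id"]).
df.filter(df.user_id.isNotNull())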
David answered Sep 17 '25 07:09