Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pyspark: Replace all occurrences of a value with null in dataframe

I have a dataframe similar to below. I originally filled all null values with -1 to do my joins in Pyspark.

df = pd.DataFrame({'Number': ['1', '2', '-1', '-1'],
                   'Letter': ['A', '-1', 'B', 'A'],
                   'Value': [30, 30, 30, -1]})


pyspark_df = spark.createDataFrame(df)

+------+------+-----+
|Number|Letter|Value|
+------+------+-----+
|     1|     A|   30|
|     2|    -1|   30|
|    -1|     B|   30|
|    -1|     A|   -1|
+------+------+-----+

After processing the dataset, I need to replace all -1 back to null values.

+------+------+-----+
|Number|Letter|Value|
+------+------+-----+
|     1|     A|   30|
|     2|  null|   30|
|  null|     B|   30|
|  null|     A| null|
+------+------+-----+

What's the easiest way to do this?

like image 786
sanjayr Avatar asked Jun 29 '20 15:06

sanjayr


3 Answers

Another way to do this in a less verbose manner is to use replace.

pyspark_df.replace(-1,None).replace('-1',None).show()
like image 199
murtihash Avatar answered Oct 19 '22 00:10

murtihash


when+otherwise shall do the trick:

import pyspark.sql.functions as F

pyspark_df.select([F.when(F.col(i).cast("Integer") <0 , None).otherwise(F.col(i)).alias(i)
                  for i in df.columns]).show()

+------+------+-----+
|Number|Letter|Value|
+------+------+-----+
|     1|     A|   30|
|     2|  null|   30|
|  null|     B|   30|
|  null|     A| null|
+------+------+-----+
like image 28
anky Avatar answered Oct 18 '22 22:10

anky


You can scan all columns and replace -1's with None:

import pyspark.sql.functions as F

for x in pyspark_df.columns:
    pyspark_df = pyspark_df.withColumn(x, F.when(F.col(x)==-1, F.lit(None)).otherwise(F.col(x)))

pyspark_df.show()

Output:

+------+------+-----+
|Number|Letter|Value|
+------+------+-----+
|     1|     A|   30|
|     2|  null|   30|
|  null|     B|   30|
|  null|     A| null|
+------+------+-----+
like image 3
Ala Tarighati Avatar answered Oct 18 '22 22:10

Ala Tarighati