I am working with Spark v1.6. I have the following two DataFrames, and I want to convert the null values to 0 in the result of my left outer join. Any suggestions?
val x: Array[Int] = Array(1,2,3)
val df_sample_x = sc.parallelize(x).toDF("x")
val y: Array[Int] = Array(3,4,5)
val df_sample_y = sc.parallelize(y).toDF("y")
val df_sample_join = df_sample_x.join(df_sample_y, df_sample_x("x") === df_sample_y("y"), "left_outer")
scala> df_sample_join.show
x | y
--------
1 | null
2 | null
3 | 3
But I want the result set to be displayed as:
scala> df_sample_join.show
x | y
--------
1 | 0
2 | 0
3 | 3
Just use na.fill, passing the replacement value and the columns to fill:
df.na.fill(0, Seq("y"))
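Applied to the DataFrames from the question, the na.fill call might look like the sketch below. It assumes a Spark 1.6 spark-shell session, where sc and sqlContext are already defined. The names df_filled and df_coalesced are made up for illustration, and the coalesce variant is an alternative I am adding, not part of the original answer.

```scala
// Assumes a Spark 1.6 spark-shell, where `sc` and `sqlContext` already exist.
import sqlContext.implicits._
import org.apache.spark.sql.functions.{coalesce, lit}

val df_sample_x = sc.parallelize(Array(1, 2, 3)).toDF("x")
val df_sample_y = sc.parallelize(Array(3, 4, 5)).toDF("y")

val df_sample_join = df_sample_x.join(
  df_sample_y, df_sample_x("x") === df_sample_y("y"), "left_outer")

// na.fill returns a new DataFrame; df_sample_join itself is left unchanged.
// Only the listed column ("y") has its nulls replaced with 0.
val df_filled = df_sample_join.na.fill(0, Seq("y"))
df_filled.show()

// Alternative (hypothetical name df_coalesced): coalesce picks the first
// non-null value per row, so null y values fall back to the literal 0.
val df_coalesced = df_sample_join.withColumn(
  "y", coalesce(df_sample_join("y"), lit(0)))
df_coalesced.show()
```

Because DataFrames are immutable, the result has to be assigned to a new val (or chained directly into show). Calling na.fill without using its return value has no visible effect.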