
Getting the minimum or maximum of two similar columns in Scala

I just started learning Scala and I'm trying to find a way to get the minimum across two or more columns of the same type in a Spark DataFrame. The following code gives me the min of one column and the max of another, each computed separately:

inputDF.select(min($"dropoff_longitude")).show
inputDF.select(max($"pickup_longitude")).show

How do I get the minimum across both columns, dropoff_longitude and pickup_longitude? I did it like this:

scala.math.min(
   inputDF.select(min($"pickup_longitude")).head.getFloat(0),
   inputDF.select(min($"dropoff_longitude")).head.getFloat(0)
)

Is there a better way to do this?

Thank you

Aditya Vikas Devarapalli asked Dec 08 '22 18:12



1 Answer

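You can use the least and greatest Spark SQL functions in select expressions for this purpose. Both compare values across columns within each row, so you can aggregate their result with min or max afterwards. In your case it will look like this: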

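import org.apache.spark.sql.functions._
// Outside spark-shell, the $"..." column syntax also needs: import spark.implicits._

val minLongitude =
    inputDF.select(least($"pickup_longitude", $"dropoff_longitude") as "least_longitude")
      .agg(min($"least_longitude"))   // smallest of the row-wise minimums
      .head.getFloat(0)               // assumes a FloatType column, as in the question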
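
If you need both the overall minimum and the overall maximum, the same idea can probably be folded into a single aggregation so the data is scanned only once. The following is a minimal sketch, not a definitive implementation: the local SparkSession, the sampleDF name, and the sample coordinates are hypothetical stand-ins for your inputDF, and the columns are assumed to be FloatType as in the question. It is written to be pasted into spark-shell or run as a Scala script with Spark on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{greatest, least, max, min}

// Hypothetical local session, for illustration only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("least-greatest-sketch")
  .getOrCreate()
import spark.implicits._

// Made-up sample rows standing in for inputDF.
val sampleDF = Seq(
  (-73.98f, -73.95f),
  (-74.01f, -73.99f),
  (-73.97f, -74.02f)
).toDF("pickup_longitude", "dropoff_longitude")

// least/greatest compare values within each row;
// min/max then aggregate those row-wise results across all rows.
val row = sampleDF.agg(
  min(least($"pickup_longitude", $"dropoff_longitude")),
  max(greatest($"pickup_longitude", $"dropoff_longitude"))
).head

val minLongitude = row.getFloat(0)
val maxLongitude = row.getFloat(1)
```

Note that least and greatest skip null values and return null only when every argument is null, so rows with one missing longitude still contribute their other value.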
Sergiy Sokolenko answered Dec 21 '22 23:12
