I just started learning Scala and I'm trying to find a way to get the min across two or more Columns of the same type in a DataFrame. I have the following code, which gives me the min or max of a single Column:
inputDF.select(min($"dropoff_longitude")).show
inputDF.select(max($"pickup_longitude")).show
How do I get the min across both Columns, dropoff_longitude and pickup_longitude? I did it like this:
scala.math.min(
   inputDF.select(min($"pickup_longitude")).head.getFloat(0),
   inputDF.select(min($"dropoff_longitude")).head.getFloat(0)
)
Is there a better way to do this?
Thank you
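You can use the least and greatest Spark SQL functions in select expressions for this. Each takes two or more columns and returns the smallest or largest value per row, which you can then aggregate with min or max. In your case it will look like this: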
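import org.apache.spark.sql.functions._
// least(...) picks the smaller of the two values in each row;
// min(...) then aggregates those per-row minima over the whole DataFrame.
val minLongitude =
    inputDF.select(least($"pickup_longitude", $"dropoff_longitude") as "least_longitude")
      .agg(min($"least_longitude"))
      .head.getFloat(0)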
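If you also need the max, the same idea should work with greatest, and both values can be computed in a single aggregation. Below is a minimal sketch, assuming both columns are FloatType as in the question. The object name, the helper minMaxLongitude, the local SparkSession and the sample rows are made up for illustration and are not part of the original code.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, greatest, least, max, min}

object LongitudeBounds {
  // Hypothetical helper: returns (min, max) across both longitude columns.
  // Assumes both columns are FloatType and the DataFrame has at least one
  // non-null value, otherwise getFloat would be reading a null.
  def minMaxLongitude(df: DataFrame): (Float, Float) = {
    val row = df.agg(
      // per-row smaller value, then the smallest of those over all rows
      min(least(col("pickup_longitude"), col("dropoff_longitude"))),
      // per-row larger value, then the largest of those over all rows
      max(greatest(col("pickup_longitude"), col("dropoff_longitude")))
    ).head
    (row.getFloat(0), row.getFloat(1))
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("longitude-bounds")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for inputDF.
    val inputDF = Seq(
      (-73.99f, -73.95f),
      (-74.01f, -73.98f),
      (-73.97f, -74.02f)
    ).toDF("pickup_longitude", "dropoff_longitude")

    val (minLongitude, maxLongitude) = minMaxLongitude(inputDF)
    println(s"min = $minLongitude, max = $maxLongitude")

    spark.stop()
  }
}
```

Note that least and greatest skip null values within a row, and min and max ignore nulls when aggregating, so a missing pickup or dropoff value should not affect the result as long as some non-null value exists.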