I just started learning Scala and I'm trying to figure out a way to get the min of two or more Columns of the same type in a DataFrame. I have the following code, which gives me the min and max of a Column individually.
inputDF.select(min($"dropoff_longitude")).show
inputDF.select(max($"pickup_longitude")).show
How do I get the min of both Columns, dropoff_longitude and pickup_longitude? I did it like this:
scala.math.min(
inputDF.select(min($"pickup_longitude")).head.getFloat(0),
inputDF.select(min($"dropoff_longitude")).head.getFloat(0)
)
Is there a better way to do this?
Thank you
You can use the least and greatest Spark SQL functions in select expressions for this purpose. In your case it will look like this:
import org.apache.spark.sql.functions._

val minLongitude =
  df.select(least($"pickup_longitude", $"dropoff_longitude") as "least_longitude")
    .agg(min($"least_longitude"))
    .head.getFloat(0)