This may be a really simple question. I am using Spark 1.6 with Scala:
var DF = hivecontext.sql("select name from myTable")
val name_max_len = DF.agg(max(length($"name"))) // did not work
println(name_max_len)
How can I get the max length?
Spark SQL provides a length() function (char_length in SQL) that takes a column and returns the character length of string data, including trailing spaces; if the input column is binary, it returns the number of bytes instead. The same function can be used inside filter() to select DataFrame rows by the length of a column, as in the sketch below.
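For example, a minimal filtering sketch, assuming a toy df built from a local Seq (the column name "name" matches the question):

import org.apache.spark.sql.functions.length
import sqlContext.implicits._ // or hivecontext.implicits._; provides $ and toDF

val df = Seq("foo", "bar", "foobar").toDF("name")

// Keep only rows whose name has more than 3 characters ("foobar" here)
df.filter(length($"name") > 3).show()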
You need to collect the result: agg returns another DataFrame, so println(name_max_len) only prints the DataFrame's string representation, not the value itself.
import org.apache.spark.sql.functions.{length, max}
import sqlContext.implicits._ // provides $, toDF, and the Int encoder for as[Int]

val df = Seq("foo", "bar", "foobar").toDF("name")

// Aggregate, then pull the single value back to the driver
df.agg(max(length($"name"))).as[Int].first
// res0: Int = 6
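If you would rather avoid the Dataset conversion (as[Int] was still experimental in Spark 1.6), an equivalent sketch reads the value straight out of the first Row, assuming the same df as above:

// max(length(...)) over a string column yields an integer, so getInt(0) works
val maxLen: Int = df.agg(max(length($"name"))).first.getInt(0)
// maxLen: Int = 6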