I'm extracting data from MySQL/MariaDB, and while creating a Dataset an error occurs with the data types:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Cannot up cast AMOUNT from decimal(30,6) to decimal(38,18) as it may truncate
The type path of the target object is:
- field (class: "org.apache.spark.sql.types.Decimal", name: "AMOUNT")
- root class: "com.misp.spark.Deal"
You can either add an explicit cast to the input data or choose a higher precision type of the field in the target object;
The case class is defined like this:

case class Deal(
  AMOUNT: Decimal
)
Does anyone know how to fix this without touching the database?
Building on @user2737635's answer, you can use a foldLeft rather than foreach to avoid defining your dataset as a var and redefining it:
// First read the data into a DataFrame in whatever way suits you.
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.DecimalType

val df: DataFrame = ???
val dfSchema = df.schema

// Walk the schema once, widening every decimal column that is not already
// decimal(38,18); all other columns pass through unchanged.
dfSchema.foldLeft(df) { (dataframe, field) =>
  field.dataType match {
    case t: DecimalType if t != DecimalType(38, 18) =>
      dataframe.withColumn(field.name, col(field.name).cast(DecimalType(38, 18)))
    case _ => dataframe
  }
}.as[YourCaseClassWithBigDecimal]