When I try to collect data from a Spark DataFrame, I get an error stating
"java.lang.IllegalArgumentException: requirement failed: Decimal precision 39 exceeds max precision 38".
All of the data in the Spark DataFrame comes from an Oracle database, where I believe the decimal precision is < 38. Is there any way I can achieve this without modifying the data?
# Load required table into memory from Oracle database
df <- loadDF(sqlContext, source = "jdbc", url = "jdbc:oracle:thin:usr/[email protected]:1521" , dbtable = "TBL_NM")
RawData <- df %>%
  filter(DT_Column > DATE('2015-01-01'))
RawData <- as.data.frame(RawData)
This gives an error; the stack trace is below:
WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 10...***, executor 0): java.lang.IllegalArgumentException: requirement failed: Decimal precision 39 exceeds max precision 38
    at scala.Predef$.require(Predef.scala:224)
    at org.apache.spark.sql.types.Decimal.set(Decimal.scala:113)
    at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:426)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3$$anonfun$9.apply(JdbcUtils.scala:337)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3$$anonfun$9.apply(JdbcUtils.scala:337)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$nullSafeConvert(JdbcUtils.scala:438)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3.apply(JdbcUtils.scala:337)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$makeGetter$3.apply(JdbcUtils.scala:335)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:286)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:268)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
    at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Please suggest a solution. Thank you.
I ran across this with AWS Glue and Postgres. A bug fix in Spark 2.1.0 resolved it for most people, but someone posted a workaround in the comments that uses the customSchema option on the JDBC read.
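For illustration, the customSchema JDBC option (available in Spark 2.3 and later) lets you tell Spark which type to use for a column instead of relying on the inferred mapping, so the data itself does not have to change. A minimal SparkR sketch, where the column name AMT, the DECIMAL(38, 10) type, and the connection details are placeholders rather than values from the question:

# Hedged sketch: override the inferred type of the offending column via the
# JDBC customSchema option (Spark 2.3+). AMT, DECIMAL(38, 10), and the
# connection details below are placeholders.
library(SparkR)
sparkR.session()

df <- read.df(
  source       = "jdbc",
  url          = "jdbc:oracle:thin:usr/pwd@host:1521",
  dbtable      = "TBL_NM",
  customSchema = "AMT DECIMAL(38, 10), DT_Column DATE"
)

Only the columns listed in customSchema are overridden; every other column keeps Spark's default JDBC type mapping.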
I had a similar issue with AWS Glue and Spark SQL: I was calculating a currency amount, so the result was a float. Glue threw the error "Decimal precision 1 exceeds max precision -1" even though the Glue Data Catalog defined the column as a decimal. Taking a page from the customSchema solution above, I explicitly cast the column as NUMERIC(10,2) and Spark stopped complaining. A sketch of that kind of cast follows below.
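A cast along those lines can be written directly in Spark SQL; here is a minimal SparkR sketch, where the amount column and view name are assumptions for illustration and DECIMAL(10, 2) is Spark SQL's spelling of NUMERIC(10, 2):

# Hedged sketch: cast the computed value to a fixed DECIMAL(10, 2) so Spark
# does not have to infer the precision. Column and view names are placeholders.
library(SparkR)
sparkR.session()

createOrReplaceTempView(df, "raw_tbl")
result <- sql("SELECT CAST(amount AS DECIMAL(10, 2)) AS amount, DT_Column FROM raw_tbl")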