I have a Python virtual environment in which I have installed PySpark v3.4.1. I ran the following command to add the Iceberg package:
spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.2_2.12:1.3.0 \
--conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
--conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \
--conf spark.sql.catalog.spark_catalog.type=hive \
--conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
--conf spark.sql.catalog.local.type=hadoop \
--conf spark.sql.catalog.local.warehouse=$PWD/warehouse
It raises the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/expressions/AnsiCast
I am able to load the packages if I remove the extensions config, but then I cannot use the MERGE INTO statement in my SQL queries. What could be causing this error?
You must use:
org.apache.iceberg:iceberg-spark-extensions-3.4_2.12
org.apache.iceberg:iceberg-spark-runtime-3.4_2.12
instead of:
org.apache.iceberg:iceberg-spark-runtime-3.2_2.12
The Iceberg artifacts have to match your Spark minor version. The runtime built for Spark 3.2 references internal Catalyst classes (such as AnsiCast) that no longer exist in Spark 3.4, which is why the extensions fail to load with a NoClassDefFoundError.
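For example, assuming you stay on Iceberg 1.3.0 (which supports Spark 3.4) and keep your catalog settings unchanged, the corrected command would look roughly like this:
spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.3.0,org.apache.iceberg:iceberg-spark-extensions-3.4_2.12:1.3.0 \
--conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
--conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \
--conf spark.sql.catalog.spark_catalog.type=hive \
--conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
--conf spark.sql.catalog.local.type=hadoop \
--conf spark.sql.catalog.local.warehouse=$PWD/warehouse
With the matching extensions on the classpath, MERGE INTO statements against your Iceberg catalogs should run without the missing-class error.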