 

Writing Spark data in Delta format

Spark version: 3.2.1; Delta version: 1.2.1 (tried 2.0 as well)

I am trying to run the getting-started code to try out Delta Lake:

from pyspark.sql import SparkSession
from delta import configure_spark_with_delta_pip

builder = SparkSession.builder.appName("MyApp") \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")

spark = configure_spark_with_delta_pip(builder).getOrCreate()
data = spark.range(0, 5)
data.write.format("delta").save("/tmp/delta-table")

I am getting the error below:

Py4JJavaError: An error occurred while calling o201.showString.
org.apache.spark.SparkException: Cannot find catalog plugin class for catalog 'spark_catalog'

Can anyone please help me understand and resolve the issue? Thanks in advance.

Mohan asked Feb 28 '26 07:02

1 Answer

I'm not sure which environment and deploy mode you are using, but in general you need to add the Delta Lake jar via the spark.jars.packages config, because it is not included in Spark's default jars. For example: .config("spark.jars.packages", "io.delta:delta-core_2.12:1.2.0"). Note that the delta-core version must match your Spark version; the 1.2.x line is built against Spark 3.2.
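As a sketch (assuming a local run with pip-installed pyspark and network access to Maven; the app name and output path are illustrative), the builder from the question with the extra spark.jars.packages setting could look like:

```python
from pyspark.sql import SparkSession

# spark.jars.packages makes Spark download the Delta Lake jar and its
# dependencies from Maven Central at session startup, so that the
# DeltaCatalog and DeltaSparkSessionExtension classes can be found.
# io.delta:delta-core_2.12:1.2.1 is the line built for Spark 3.2.x.
builder = (
    SparkSession.builder.appName("MyApp")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .config("spark.jars.packages", "io.delta:delta-core_2.12:1.2.1")
)

spark = builder.getOrCreate()
spark.range(0, 5).write.format("delta").mode("overwrite").save("/tmp/delta-table")
```

For context, configure_spark_with_delta_pip(builder) achieves the same effect when the delta-spark pip package is installed: it appends the matching io.delta:delta-core coordinate to spark.jars.packages for you. When launching via spark-submit, the equivalent flag is --packages io.delta:delta-core_2.12:1.2.1.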

Jonathan answered Mar 02 '26 15:03

