Apache Avro as a Built-in Data Source in Apache Spark 2.4

Tags:

apache-spark

I recently read this article and tried out its example. When I run the following:

val usersDF = spark.read.format("avro")
                        .load("examples/src/main/resources/users.avro")

it fails with this error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Failed to find data source: avro. Avro is built-in but external data source module since Spark 2.4. Please deploy the application as per the deployment section of "Apache Avro Data Source Guide".; at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:647)

Asked by Achilleus, Mar 15 '26 21:03


1 Answer

After reading the Apache Avro Data Source Guide, I found that build.sbt needs the spark-avro module added as a dependency, since it is not bundled with Spark itself.

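val sparkVersion = "2.4.0"
// spark-avro is an external module since Spark 2.4, so it has to be declared explicitly
libraryDependencies += "org.apache.spark" %% "spark-avro" % sparkVersion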

Everything worked fine after this.
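
For reference, here is a rough sketch of how the dependency might sit in a complete build.sbt. The project name, the Scala version, and the spark-sql dependency are assumptions for illustration and were not part of my original build; adjust them to your own setup. The key point is that spark-avro uses the same version as the rest of Spark.

```scala
// build.sbt: a minimal sketch, not a definitive build definition
name := "avro-example"        // hypothetical project name
scalaVersion := "2.11.12"     // assumed; Spark 2.4.0 is built against Scala 2.11 by default

val sparkVersion = "2.4.0"

libraryDependencies ++= Seq(
  // Spark SQL provides SparkSession and the DataFrame reader (assumed to be needed here)
  "org.apache.spark" %% "spark-sql"  % sparkVersion,
  // External Avro data source; keep it on the same version as Spark
  "org.apache.spark" %% "spark-avro" % sparkVersion
)
```

With the module on the classpath, spark.read.format("avro") from the question should resolve the data source.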

Answered by Achilleus, Mar 18 '26 11:03


