I'm trying to run a Spark stream from a Kafka queue containing Avro messages. According to https://spark.apache.org/docs/latest/sql-data-sources-avro.html I should be able to use from_avro to convert the value column into a Dataset<Row>.

However, I'm unable to compile the project because it complains that from_avro cannot be found, even though I can see the method declared in the package.class of the dependency. How can I use the from_avro method from org.apache.spark.sql.avro in my Java code locally?
import java.io.IOException;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.avro.*;

import static org.apache.spark.sql.functions.*;

public class AvroStreamTest {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Creating local sparkSession here...

        Dataset<Row> df = sparkSession
                .readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "host:port")
                .option("subscribe", "avro_queue")
                .load();

        // jsonFormatSchema holds the Avro schema as a JSON string
        // Cannot resolve method 'from_avro'...
        df.select(from_avro(col("value"), jsonFormatSchema))
                .writeStream()
                .format("console")
                .outputMode("update")
                .start();
    }
}
pom.xml:

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.4.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.4.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-avro_2.11</artifactId>
        <version>2.4.0</version>
    </dependency>
    <!-- more dependencies below -->
</dependencies>
It seems that Java is unable to import names from sql.avro's package.class.
Since the Spark 2.4 release, Spark SQL provides built-in support for reading and writing Apache Avro data.

Apache Avro is an open-source, row-based data serialization and data exchange framework from the Hadoop projects. Spark's Avro support originated as an open-source library from Databricks for reading and writing data in the Avro file format, and it is widely used with Apache Spark, especially in Kafka-based data pipelines.

Read and write options: when reading or writing Avro data in Spark via DataFrameReader or DataFrameWriter, there are a few options we can specify: avroSchema - an optional Avro schema provided as a JSON string. recordName - the top-level record name in the write result.
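To illustrate those options, here is a minimal batch read/write sketch. The schema string and the file paths are made-up placeholders, not values from the question:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class AvroOptionsExample {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("AvroOptionsExample")
                .master("local[*]")
                .getOrCreate();

        // Hypothetical Avro schema as a JSON string; replace with your own.
        String avroSchema = "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
                + "[{\"name\":\"name\",\"type\":\"string\"}]}";

        // avroSchema tells the reader which schema to use when decoding.
        Dataset<Row> users = spark.read()
                .format("avro")
                .option("avroSchema", avroSchema)
                .load("/tmp/users.avro");          // example input path

        // recordName sets the top-level record name in the written files.
        users.write()
                .format("avro")
                .option("recordName", "User")
                .save("/tmp/users_out.avro");      // example output path

        spark.stop();
    }
}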
This is because of the class names the Scala compiler generates: from_avro is declared in the package object of org.apache.spark.sql.avro, which is compiled to a class named package$. Importing it as

import org.apache.spark.sql.avro.package$;

and then calling

package$.MODULE$.from_avro(...)

should work.
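A minimal sketch of the workaround applied to the code from the question (the schema string, topic name, and bootstrap servers are placeholders):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
// Scala compiles the package object of org.apache.spark.sql.avro
// into a class named package$; import it explicitly.
import org.apache.spark.sql.avro.package$;

import static org.apache.spark.sql.functions.col;

public class AvroStreamTest {
    public static void main(String[] args) throws Exception {
        SparkSession sparkSession = SparkSession.builder()
                .appName("AvroStreamTest")
                .master("local[*]")
                .getOrCreate();

        // Hypothetical schema string; replace with your actual Avro schema.
        String jsonFormatSchema = "{\"type\":\"record\",\"name\":\"Msg\",\"fields\":"
                + "[{\"name\":\"id\",\"type\":\"string\"}]}";

        Dataset<Row> df = sparkSession
                .readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "host:port")
                .option("subscribe", "avro_queue")
                .load();

        // Call from_avro through the package object's singleton instance.
        df.select(package$.MODULE$.from_avro(col("value"), jsonFormatSchema))
                .writeStream()
                .format("console")
                .outputMode("update")
                .start()
                .awaitTermination();
    }
}

For what it's worth, Spark 3.0 and later also expose a Java-friendly org.apache.spark.sql.avro.functions class, so on newer versions import static org.apache.spark.sql.avro.functions.from_avro; works directly and the package$ workaround is only needed on 2.4.x.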