
Spark Twitter Streaming exception : (org.apache.spark.Logging) classnotfound

I am trying to run the Spark Twitter Streaming example in Scala with Maven, but I get the following error when I run it:

Caused by: java.lang.ClassNotFoundException: org.apache.spark.Logging

Below are my dependencies:

<dependencies>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.10</artifactId>
    <version>2.0.0</version>
</dependency> 
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-twitter_2.11</artifactId>
    <version>1.6.2</version> 
</dependency>
</dependencies>

I know that Logging has been moved to org.apache.spark.internal.Logging, but I don't know whether that is the cause. I have already tried changing the dependency versions to the latest ones, with no luck.

Ahmad Abu-Hamideh asked Aug 11 '16 10:08


1 Answer

TL;DR

The class org.apache.spark.Logging is available in Spark 1.5.2 and lower (though I didn't test every lower version), but it is not available in any higher version.


It all comes down to using incompatible versions of Apache Spark:

1. Let's try to import org.apache.spark.Logging on Spark 2.0.0:

user@ubuntu:~$ /opt/spark/bin/spark-shell
Welcome to
  ____              __
 / __/__  ___ _____/ /__
_\ \/ _ \/ _ `/ __/  '_/
/___/ .__/\_,_/_/ /_/\_\   version 2.0.0
   /_/      
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_101)
scala> import org.apache.spark.Logging
<console>:23: error: object Logging is not a member of package org.apache.spark
import org.apache.spark.Logging
          ^

The class org.apache.spark.Logging is not found.


2. Let's try to import org.apache.spark.Logging on Spark 1.6.2:

(Same result as above: the class org.apache.spark.Logging is not found.)


3. Let's try to import org.apache.spark.Logging on Spark 1.5.2:

user@ubuntu:~$ /opt/spark-1.5.2-bin-hadoop2.6/bin/spark-shell
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.2
      /_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_101)
scala> import org.apache.spark.Logging
import org.apache.spark.Logging

Yes! It is available and imports successfully.

As you can see, org.apache.spark.Logging, which spark-streaming-twitter requires, is available in Spark 1.5.2 and lower, so I would recommend using Spark 1.5.2 or a lower version.

Hence, you should replace your Maven dependencies with the following (assuming that you are using Scala 2.11.x):

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>1.5.2</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>1.5.2</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-twitter_2.11</artifactId>
    <version>1.6.2</version>
</dependency>

Note that the _2.11 suffix in the artifactId refers to the Scala version, while the version element (1.5.2 or 1.6.2) refers to the version of the library itself (spark-core, spark-streaming, or spark-streaming-twitter).
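
One way to keep the Scala suffix and the Spark version from drifting apart between dependencies is to define them once as Maven properties. This is only a sketch of the dependencies listed above. The property names scala.binary.version and spark.version are my own choice (any names work as long as they are used consistently), and the versions are simply the ones recommended in this answer:

```xml
<properties>
    <!-- Hypothetical property names, chosen for illustration -->
    <scala.binary.version>2.11</scala.binary.version>
    <spark.version>1.5.2</spark.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <!-- Scala suffix comes from a single property -->
        <artifactId>spark-core_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-twitter_${scala.binary.version}</artifactId>
        <!-- Library version kept as recommended above -->
        <version>1.6.2</version>
    </dependency>
</dependencies>
```

With this layout, a mix such as spark-core_2.10 next to spark-streaming-twitter_2.11 (as in the question) cannot happen by accident, because every artifactId takes its suffix from the same property.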

Ajeet Shah answered Oct 12 '22 19:10