Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Spark: unable to load native-hadoop library for platform

I am trying to start with Spark. I have Hadoop (3.3.1) and Spark (3.2.2) in my library. I have set the SPARK_HOME, PATH, HADOOP_HOME and LD_LIBRARY_PATH to their respective paths. I am also running JDK 17 (echo and -version work fine in the terminal).

Yet, I still get the following error:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
21/10/25 17:17:07 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module @0x1f508f09) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module @0x1f508f09
  at org.apache.spark.storage.StorageUtils$.<init>(StorageUtils.scala:213)
  at org.apache.spark.storage.StorageUtils$.<clinit>(StorageUtils.scala)
  at org.apache.spark.storage.BlockManagerMasterEndpoint.<init>(BlockManagerMasterEndpoint.scala:110)
  at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:348)
  at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:287)
  at org.apache.spark.SparkEnv$.create(SparkEnv.scala:336)
  at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:191)
  at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:277)
  at org.apache.spark.SparkContext.<init>(SparkContext.scala:460)
  at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2690)
  at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:949)
  at scala.Option.getOrElse(Option.scala:189)
  at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:943)
  at org.apache.spark.repl.Main$.createSparkSession(Main.scala:106)
  ... 55 elided
<console>:14: error: not found: value spark
       import spark.implicits._
              ^
<console>:14: error: not found: value spark
       import spark.sql
              ^
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.2.0
      /_/
         
Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 17.0.1)
Type in expressions to have them evaluated.
Type :help for more information.

Any ideas how to fix this?

like image 471
CoderKasper Avatar asked Oct 25 '21 15:10

CoderKasper


1 Answers

Java 17 isn't supported - Spark runs on Java 8/11 (source: https://spark.apache.org/docs/latest/).

So install Java 11 and point Spark to that.

The warning unable to load native-hadoop library for platform is quite common and doesn't mean that anything's wrong.

like image 172
Ben Watson Avatar answered Oct 19 '22 17:10

Ben Watson