Why does spark-shell fail with "The root scratch dir: /tmp/hive on HDFS should be writable."?

I am a Spark newbie on Windows 10, trying to get Spark working. I have set the environment variables correctly, and I also have winutils.exe. When I go into spark/bin and type spark-shell, Spark starts but gives the errors shown below.

It also doesn't show the Spark context or Spark session.
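
For reference, the environment variables I mean are along these lines (the Spark path is from my machine; the Hadoop location is only an example of where winutils.exe might live):

    :: HADOOP_HOME here is illustrative - winutils.exe is expected in %HADOOP_HOME%\bin
    set SPARK_HOME=C:\Users\Akshay\Downloads\spark
    set HADOOP_HOME=C:\hadoop
    set PATH=%PATH%;%SPARK_HOME%\bin;%HADOOP_HOME%\bin

Here is the full output when I run spark-shell: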

C:\Users\Akshay\Downloads\spark\bin>spark-shell
    17/06/19 23:45:12 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    17/06/19 23:45:19 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Users/Akshay/Downloads/spark/bin/../jars/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Users/Akshay/Downloads/spark/jars/datanucleus-api-jdo-3.2.6.jar."
    17/06/19 23:45:20 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Users/Akshay/Downloads/spark/bin/../jars/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Users/Akshay/Downloads/spark/jars/datanucleus-core-3.2.10.jar."
    17/06/19 23:45:20 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Users/Akshay/Downloads/spark/bin/../jars/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Users/Akshay/Downloads/spark/jars/datanucleus-rdbms-3.2.9.jar."
    java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':
      at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:981)
      at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:110)
      at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:109)
      at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
      at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:878)
      at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
      at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
      at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
      at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
      at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
      at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:878)
      at org.apache.spark.repl.Main$.createSparkSession(Main.scala:96)
      ... 47 elided
    Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
      at java.lang.reflect.Constructor.newInstance(Unknown Source)
      at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:978)
      ... 58 more
    Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
      at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:169)
      at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:86)
      at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
      at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:101)
      at scala.Option.getOrElse(Option.scala:121)
      at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:101)
      at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:100)
      at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:157)
      at org.apache.spark.sql.hive.HiveSessionState.<init>(HiveSessionState.scala:32)
      ... 63 more
    Caused by: java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
      at java.lang.reflect.Constructor.newInstance(Unknown Source)
      at org.apache.spark.sql.internal.SharedState$.org$apache$spark$sql$internal$SharedState$$reflect(SharedState.scala:166)
      ... 71 more
    Caused by: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
      at java.lang.reflect.Constructor.newInstance(Unknown Source)
      at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
      at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:358)
      at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:262)
      at org.apache.spark.sql.hive.HiveExternalCatalog.<init>(HiveExternalCatalog.scala:66)
      ... 76 more
    Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
      at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
      at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:188)
      ... 84 more
    Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
      at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612)
      at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
      at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
      ... 85 more
    <console>:14: error: not found: value spark
           import spark.implicits._
                  ^
    <console>:14: error: not found: value spark
           import spark.sql
                  ^
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 2.1.1
          /_/

    Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_101)
    Type in expressions to have them evaluated.
    Type :help for more information.

    scala>    

How do I resolve this?

asked Jun 20 '17 by Akshay

2 Answers

As per the error message:

Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: ---------
  at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:612)
  at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
  at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
  ... 85 more

Although the message says /tmp/hive "on HDFS", in a local Windows setup that path resolves to C:\tmp\hive on the drive you run spark-shell from. You should change the permissions of the C:\tmp\hive directory as follows:

winutils.exe chmod -R 777 C:\tmp\hive
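
If C:\tmp\hive does not exist yet, create it first, then set and verify the permissions. A minimal sequence, assuming winutils.exe is on your PATH (otherwise call it by its full path):

    mkdir C:\tmp\hive
    winutils.exe chmod -R 777 C:\tmp\hive
    winutils.exe ls C:\tmp\hive

The ls output should start with drwxrwxrwx. Once it does, restart spark-shell; the banner should then report the Spark context as sc and the Spark session as spark instead of the "not found: value spark" errors.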
answered Nov 07 '22 by Jacek Laskowski

Please refer to this article, which describes how to run Spark on Windows 10 with Hadoop support: Spark on windows

answered Nov 07 '22 by FaigB