I have created a Spark application that uses the Hive metastore, but on the line that creates the external Hive table I get the following error when I run the application (Spark driver logs):
Exception in thread "main" org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwxrwxr-x;
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:214)
at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:114)
at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:102)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwxrwxr-x
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at org.apache.spark.sql.hive.client.HiveClientImpl.newState(HiveClientImpl.scala:183)
at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:117)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
I run the application using the Spark Operator for Kubernetes, so I checked the permissions of the directories on the driver pod of the Spark application:
ls -l /tmp
...
drwxrwxr-x 1 1001 1001 4096 Feb 22 16:47 hive
If I try to change the permissions, it has no effect. I run the Hive metastore and HDFS in Kubernetes as well.
How can this problem be fixed?
This is a common error. It can be fixed by creating a scratch directory in another location and pointing Spark to the new directory.
Step 1: Create a new directory called tmpops at /tmp/tmpops (e.g. mkdir -p /tmp/tmpops).
Step 2: Give the directory open permissions: chmod 777 /tmp/tmpops
Note: 777 is for local testing only. If you are working with sensitive data, make sure to secure this path to avoid accidental data leakage and security loopholes.
Step 3: Add the property below to the hive-site.xml that the Spark app refers to:
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/tmpops</value>
</property>
Once you do this, the error will no longer appear unless someone deletes that directory.
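If mounting a modified hive-site.xml into the driver pod is inconvenient, the same property can usually be set directly on the SparkSession builder before the session is created. Here is a minimal Scala sketch, assuming a Hive-enabled Spark build; the object name and app name are placeholders:

import org.apache.spark.sql.SparkSession

object ScratchDirExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("scratchdir-example") // placeholder app name
      // Point Hive at the new, writable scratch directory from Step 1.
      .config("hive.exec.scratchdir", "/tmp/tmpops")
      .enableHiveSupport()
      .getOrCreate()

    // Any Hive-backed operation now uses /tmp/tmpops as its scratch dir.
    spark.sql("SHOW DATABASES").show()

    spark.stop()
  }
}

Note that Hive configuration values set this way only take effect if they are applied before the first Hive-enabled SparkSession is created.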