
You must build Spark with Hive. Export 'SPARK_HIVE=true'

I'm trying to run a notebook on Analytics for Apache Spark running on Bluemix, but I hit the following error:

Exception: ("You must build Spark with Hive. Export 'SPARK_HIVE=true' and 
run build/sbt assembly", Py4JJavaError(u'An error occurred while calling 
None.org.apache.spark.sql.hive.HiveContext.\n', JavaObject id=o38))

The error is intermittent - it doesn't always happen. The line of code in question is:

df = sqlContext.read.format('jdbc').options(
            url=url, 
            driver='com.ibm.db2.jcc.DB2Driver', 
            dbtable='SAMPLE.ASSETDATA'
        ).load()

There are a few similar questions on Stack Overflow, but they aren't asking about the Spark service on Bluemix.

asked Dec 14 '22 by Chris Snow
1 Answer

That statement initializes a HiveContext under the covers. The HiveContext then initializes a local Derby database to hold its metadata. The Derby database is created in the current directory by default. The reported problem occurs under these circumstances (among others):

  1. The Derby database already exists, and there are leftover lock files because the notebook kernel that last used it didn't shut down properly.
  2. The Derby database already exists, and is currently in use by another notebook kernel that also initialized a HiveContext.
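
Both circumstances can be checked for directly: Derby leaves `*.lck` files inside the metastore directory, so their presence after a kernel restart points at case 1. A minimal sketch in Python (the `metastore_db` default matches Derby's behaviour described above; the function name is my own):

```python
import glob
import os

def find_leftover_locks(metastore_dir="./metastore_db"):
    """Return the Derby lock files present in the metastore directory.

    A non-empty result after a kernel restart suggests the previous
    kernel did not shut down cleanly (case 1 above).
    """
    return sorted(glob.glob(os.path.join(metastore_dir, "*.lck")))
```

If the list is non-empty but you know no other kernel is running, it is safe to delete those files.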

Until IBM changes the default setup to avoid this problem, possible workarounds are:

  • For case 1, delete the leftover lockfiles. From a Python notebook, this is done by executing:

    !rm -f ./metastore_db/*.lck
    
  • For case 2, change the current working directory before the Hive context is created. In a Python notebook, the following cell changes into a newly created temporary directory:

    import os
    import tempfile
    os.chdir(tempfile.mkdtemp())
    

    But beware, it will clutter the filesystem with a new directory and Derby database each time you run that notebook.
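
For completeness, both workarounds can also be written in plain Python instead of a shell escape, which makes them easier to reuse across notebooks; a sketch, with function names of my own invention:

```python
import glob
import os
import tempfile

def clear_stale_locks(metastore_dir="./metastore_db"):
    """Case 1: delete leftover Derby lock files (equivalent of rm -f *.lck)."""
    for lck in glob.glob(os.path.join(metastore_dir, "*.lck")):
        os.remove(lck)

def use_fresh_workdir():
    """Case 2: move into a new temporary directory so Derby creates a
    private metastore there instead of contending for the shared one."""
    new_dir = tempfile.mkdtemp()
    os.chdir(new_dir)
    return new_dir
```

Run whichever applies before the cell that creates the Hive context; `use_fresh_workdir` has the same directory-clutter caveat as the `os.chdir` snippet above.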

I happen to know that IBM is working on a fix. Please use these workarounds only if you encounter the problem, not proactively.

answered Feb 11 '23 by Roland Weber