
Spark can access Hive table from pyspark but not from spark-submit

When running from the pyspark shell, I type the following (without creating any contexts myself):

df_openings_latest = sqlContext.sql('select * from experian_int_openings_latest_orc')

... and it works fine.

However, when I run my script with spark-submit, like

spark-submit script.py

I put the following in the script:

from pyspark.sql import SQLContext
from pyspark import SparkConf, SparkContext
conf = SparkConf().setAppName('inc_dd_openings')
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)

df_openings_latest = sqlContext.sql('select * from experian_int_openings_latest_orc')

But it gives me an error

pyspark.sql.utils.AnalysisException: u'Table not found: experian_int_openings_latest_orc;'

So it doesn't see my table.

What am I doing wrong? Please help

P.S. Spark version is 1.6 running on Amazon EMR

Denys asked Apr 01 '16 15:04





1 Answer

Spark 2.x

The same problem may occur in Spark 2.x if SparkSession has been created without enabling Hive support.
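
As a minimal sketch for Spark 2.x, Hive support can be requested on the builder before the session is created. The helper name build_hive_session is hypothetical; the application name is the one from the question. This assumes Spark was built with Hive support and can reach the metastore.

```python
def build_hive_session(app_name="inc_dd_openings"):
    """Return a SparkSession with Hive support enabled (Spark 2.x)."""
    # Imported inside the function so the module still loads
    # on a machine where pyspark is not installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()  # without this, Hive tables are not visible
        .getOrCreate()
    )
```

In a script passed to spark-submit, spark = build_hive_session() followed by spark.sql('select * from experian_int_openings_latest_orc') should then be able to resolve the table.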

Spark 1.x

It is pretty simple. When you use the PySpark shell and Spark has been built with Hive support, the default SQLContext implementation (the one available as sqlContext) is HiveContext.

In your standalone application you use a plain SQLContext, which doesn't provide Hive capabilities.

Assuming the rest of the configuration is correct, just replace:

from pyspark.sql import SQLContext

sqlContext = SQLContext(sc)

with

from pyspark.sql import HiveContext

sqlContext = HiveContext(sc)
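
Put together, the setup from the question might look like the sketch below. The wrapper function build_hive_context is hypothetical, introduced only to keep the example self-contained; it assumes Spark 1.x built with Hive support.

```python
def build_hive_context(app_name="inc_dd_openings"):
    """Return a HiveContext for a standalone Spark 1.x application."""
    # Imported inside the function so the module still loads
    # on a machine where pyspark is not installed.
    from pyspark import SparkConf, SparkContext
    from pyspark.sql import HiveContext

    conf = SparkConf().setAppName(app_name)
    sc = SparkContext(conf=conf)
    # HiveContext, unlike plain SQLContext, can look up tables
    # registered in the Hive metastore.
    return HiveContext(sc)
```

With sqlContext = build_hive_context(), the original sqlContext.sql('select * from experian_int_openings_latest_orc') call stays unchanged.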
zero323 answered Oct 12 '22 23:10
