How to execute .sql file in spark using python

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

conf = SparkConf().setAppName("Test").set("spark.driver.memory", "1g")
sc = SparkContext(conf = conf)

sqlContext = SQLContext(sc)

results = sqlContext.sql("/home/ubuntu/workload/queryXX.sql")

When I run this script with python test.py, I get the following error:

py4j.protocol.Py4JJavaError: An error occurred while calling o20.sql. : java.lang.RuntimeException: [1.1] failure: ``with'' expected but `/' found

/home/ubuntu/workload/queryXX.sql

at scala.sys.package$.error(package.scala:27)

I am very new to Spark and I need help here to move forward.

asked Oct 06 '15 03:10 by yguw



2 Answers

SQLContext.sql expects a valid SQL query, not a path to a file. Read the file first and pass its contents instead:

with open("/home/ubuntu/workload/queryXX.sql") as fr:
   query = fr.read()
results = sqlContext.sql(query)
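
If the .sql file holds more than one statement, passing the whole text in a single call is likely to fail, since sql() parses one statement per call as far as I know. A minimal sketch of one way around that, assuming statements are separated by semicolons and that no semicolon appears inside a string literal or comment. The helper names split_sql_statements and run_sql_file are hypothetical, not part of PySpark:

```python
def split_sql_statements(script_text):
    """Split a SQL script into individual statements on ';'.

    Deliberately naive (assumption): semicolons inside string literals
    or comments are not handled.
    """
    statements = []
    for chunk in script_text.split(";"):
        statement = chunk.strip()
        if statement:  # skip empty pieces, e.g. after a trailing ';'
            statements.append(statement)
    return statements


def run_sql_file(sql_context, path):
    """Run every statement in the file; return the result of the last one.

    sql_context is any object with a .sql(query) method, such as the
    sqlContext created in the question.
    """
    with open(path) as fr:
        statements = split_sql_statements(fr.read())
    result = None
    for statement in statements:
        result = sql_context.sql(statement)
    return result
```

With the question's setup this would be called as results = run_sql_file(sqlContext, "/home/ubuntu/workload/queryXX.sql").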
answered Oct 13 '22 19:10 by zero323


Running spark-sql --help prints:

CLI options:
 -d,--define <key=value>          Variable subsitution to apply to hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable subsitution to apply to hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)

So you can execute your SQL script like this:

spark-sql -f <your-script>.sql
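
Since the question asks about Python, the same CLI call can also be launched from a script. A rough sketch, assuming spark-sql is on the PATH and Python 3.5+ for subprocess.run. Only the -f and --database options from the help output above are used; the function names build_spark_sql_command and run_sql_script are hypothetical:

```python
import subprocess


def build_spark_sql_command(script_path, database=None):
    """Build the argument list for `spark-sql -f <script>`."""
    command = ["spark-sql"]
    if database is not None:
        # --database is listed in the CLI help output
        command += ["--database", database]
    command += ["-f", script_path]
    return command


def run_sql_script(script_path, database=None):
    """Run the script through the spark-sql CLI.

    Assumes spark-sql is on the PATH; raises CalledProcessError if the
    CLI exits with a non-zero status.
    """
    return subprocess.run(
        build_spark_sql_command(script_path, database), check=True
    )
```

Note that this runs the script in a separate Spark application, so the results are printed by the CLI rather than returned as a DataFrame.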

answered Oct 13 '22 19:10 by Jiacai Liu