How to execute .sql file in spark using python

Tags: python, apache-spark, apache-spark-sql, pyspark

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

conf = SparkConf().setAppName("Test").set("spark.driver.memory", "1g")
sc = SparkContext(conf = conf)

sqlContext = SQLContext(sc)

results = sqlContext.sql("/home/ubuntu/workload/queryXX.sql")

When I run this script with python test.py, it fails with the following error:

py4j.protocol.Py4JJavaError: An error occurred while calling o20.sql.
: java.lang.RuntimeException: [1.1] failure: ``with'' expected but `/' found

/home/ubuntu/workload/queryXX.sql

at scala.sys.package$.error(package.scala:27)

I am very new to Spark and I need help here to move forward.

535

asked Oct 06 '15 03:10

yguw

2 Answers

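SQLContext.sql expects a valid SQL query, not a path to a file. The parser tried to read the path string itself as SQL, which is why it complains about the leading `/'. Read the file contents first and pass them in: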

with open("/home/ubuntu/workload/queryXX.sql") as fr:
   query = fr.read()
results = sqlContext.sql(query)
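
One caveat, added here and not part of the original answer: sql() runs a single statement per call, so a file holding several statements separated by semicolons needs to be split first. A minimal sketch of that follows. The helper names split_sql_statements and run_sql_file are hypothetical, and the splitting is deliberately naive: it assumes no semicolons appear inside string literals or comments.

```python
def split_sql_statements(text):
    """Naively split SQL text on ';' and drop empty fragments.

    Assumption: no ';' occurs inside string literals or comments.
    """
    return [part.strip() for part in text.split(";") if part.strip()]


def run_sql_file(run_sql, path):
    """Read the file at `path` and pass each statement to `run_sql`.

    `run_sql` is any callable that takes one SQL string,
    for example sqlContext.sql. Returns one result per statement.
    """
    with open(path) as fr:
        statements = split_sql_statements(fr.read())
    return [run_sql(statement) for statement in statements]
```

With the context from the question, this could be called as run_sql_file(sqlContext.sql, "/home/ubuntu/workload/queryXX.sql"); the result of the last statement is then the last element of the returned list.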

126

answered Oct 13 '22 19:10

zero323

Running spark-sql --help prints the available options:

CLI options:
 -d,--define <key=value>          Variable subsitution to apply to hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable subsitution to apply to hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)

So you can execute your SQL script directly from the command line:

spark-sql -f <your-script>.sql
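
Since the question asks about Python, the same command can also be launched from a script. This is only a sketch: it assumes spark-sql is on the PATH, and the function names build_spark_sql_command and run_spark_sql_script are hypothetical. Only the -f flag listed in the help output above is used.

```python
import subprocess


def build_spark_sql_command(script_path, executable="spark-sql"):
    """Build the argument list for `spark-sql -f <script_path>`."""
    return [executable, "-f", script_path]


def run_spark_sql_script(script_path, executable="spark-sql"):
    """Run the script through the spark-sql CLI.

    Assumption: `executable` resolves to a working spark-sql binary.
    Raises subprocess.CalledProcessError on a non-zero exit code.
    """
    command = build_spark_sql_command(script_path, executable)
    return subprocess.run(command, check=True)
```

Unlike sqlContext.sql, this runs in a separate process, so the query results go to the CLI's standard output and are not returned to the Python program as a DataFrame.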

4

answered Oct 13 '22 19:10

Jiacai Liu
