Spark SQL: INSERT INTO statement syntax

Tags: apache-spark, apache-spark-sql

While reading the DataStax docs on the syntax Spark SQL supports, I noticed that you can use INSERT statements as you normally would:

INSERT INTO hello (someId,name) VALUES (1,"hello")

Running this in a Spark 2.0 (Python) environment connected to a MySQL database throws the following error:

File "/home/yawn/spark-2.0.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 73, in deco
pyspark.sql.utils.ParseException: 
u'\nmismatched input \'someId\' expecting {\'(\', \'SELECT\', \'FROM\', \'VALUES\', \'TABLE\', \'INSERT\', \'MAP\', \'REDUCE\'}(line 1, pos 19)\n\n== SQL ==\nINSERT INTO hello (someId,name) VALUES (1,"hello")\n-------------------^^^\n'

However, if I remove the explicit column list, it works as expected:

INSERT INTO hello VALUES (1,"hello")

Am I missing something?

asked Oct 23 '16 16:10

TMichel

1 Answer

Spark supports Hive syntax, so if you want to insert a row you can do it as follows:

insert into hello select t.* from (select 1, 'hello') t;
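Since the parser in the question rejects an explicit column list, the values have to be supplied in the table's own column order. A minimal sketch of a helper that builds such a statement from a column-to-value mapping is shown below. The names `sql_literal` and `insert_select` are hypothetical and not part of Spark, the caller is assumed to know the table's column order, and the literal rendering is deliberately simple, so it is not suitable for untrusted input.

```python
def sql_literal(value):
    """Render a Python value as a simple SQL literal (hypothetical helper)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # Assumption: refuse quotes and backslashes instead of guessing
        # the escaping rules of the target SQL dialect.
        if "'" in value or "\\" in value:
            raise ValueError("unsupported character in string literal")
        return "'" + value + "'"
    raise TypeError("unsupported value type: {}".format(type(value).__name__))


def insert_select(table, table_columns, row):
    """Build an INSERT ... SELECT statement with values in table column order.

    table_columns: column names in the order the table defines them.
    row: dict mapping column name to value; missing columns become NULL.
    """
    unknown = set(row) - set(table_columns)
    if unknown:
        raise ValueError("unknown columns: {}".format(sorted(unknown)))
    literals = [sql_literal(row.get(col)) for col in table_columns]
    return "INSERT INTO {} SELECT t.* FROM (SELECT {}) t".format(
        table, ", ".join(literals)
    )


statement = insert_select("hello", ["someId", "name"], {"name": "hello", "someId": 1})
print(statement)
```

The resulting string could then be passed to `spark.sql(statement)` on an existing `SparkSession`, assuming `hello` is already registered as a table.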

answered Oct 21 '22 19:10

Sandeep Purohit
