Spark SQL Stackoverflow

I'm new to Spark and Spark SQL, and I was trying to run the example from the Spark SQL website: just a simple SQL query after loading the schema and data from a directory of JSON files, like this:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD

val path = "/home/shaza90/Desktop/tweets_1428981780000"
val tweet = sqlContext.jsonFile(path).cache()

tweet.registerTempTable("tweet")
tweet.printSchema() // This one works fine

val texts = sqlContext.sql("SELECT tweet.text FROM tweet").collect().foreach(println)

The exception that I'm getting is this one:

java.lang.StackOverflowError
    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:254)
    at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:222)

Update

I'm able to execute select * from tweet but whenever I use a column name instead of * I get the error.

Any advice?

asked May 02 '15 by Lisa


1 Answer

This is SPARK-5009 and has been fixed in Apache Spark 1.3.0.

The issue was that, in order to recognize keywords (like SELECT) regardless of case, every possible uppercase/lowercase combination (like seLeCT) was generated by a recursive function. This recursion leads to the StackOverflowError you're seeing when the keyword is long enough and the stack size small enough. (It also suggests a workaround if upgrading to Apache Spark 1.3.0 or later is not an option: increase the JVM stack size with -Xss.)
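
For context, here is a minimal sketch (an illustration, not Spark's actual source) of the kind of recursive keyword expansion described above, and why it interacts badly with the parser combinators that appear in the stack trace:

// Sketch only: recursively generate every upper/lowercase variant of a keyword.
// A keyword of length n yields 2^n variants, and chaining that many parser
// alternatives is what produces the deeply nested Parsers.append calls in the trace.
def allCaseVersions(s: String, prefix: String = ""): Stream[String] =
  if (s.isEmpty) Stream(prefix)
  else allCaseVersions(s.tail, prefix + s.head.toLower) ++
       allCaseVersions(s.tail, prefix + s.head.toUpper)

allCaseVersions("select").foreach(println) // select, selecT, ..., SELECT (64 variants)

If you do need the stack-size workaround, it can be applied when launching the shell, for example spark-shell --driver-java-options "-Xss4m" (the 4m value is only an illustrative choice, not something prescribed here).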

answered Sep 24 '22 by Daniel Darabos