
Scala Spark - illegal start of definition

This is probably a stupid newbie mistake, but I'm getting an error when running what I thought was basic Scala code (in a Spark notebook, via Jupyter):

val sampleDF = spark.read.parquet("/data/my_data.parquet")

sampleDF
  .limit(5)
  .write
  .format("jdbc")
  .option("url", "jdbc:sqlserver://sql.example.com;database=my_database")
  .option("dbtable", "my_schema.test_table")
  .option("user", "foo")
  .option("password", "bar")
  .save()

The error:

<console>:1: error: illegal start of definition
    .limit(5)
    ^

What am I doing wrong?

shadowtalker asked Jul 10 '18 17:07


2 Answers

I don't know anything about Jupyter internals, but I suspect this is an artifact of the Jupyter-REPL interaction: sampleDF is for some reason treated as a complete statement on its own. Try wrapping the whole chain in parentheses:

(sampleDF
  .limit(5)
  .write
  .format("jdbc")
  .option("url", "jdbc:sqlserver://sql.example.com;database=my_database")
  .option("dbtable", "my_schema.test_table")
  .option("user", "foo")
  .option("password", "bar")
  .save())

Andrey Tyukin answered Oct 20 '22 14:10


Jupyter tries to interpret each line as a complete command, so sampleDF on its own is accepted as a valid expression; the interpreter then moves on to the next line, which starts with a dot, and reports the error. Move the dots to the end of the previous line to let the interpreter know that "there's more stuff coming":

sampleDF.
  limit(5).
  write.
  format("jdbc").
  option("url", "jdbc:sqlserver://sql.example.com;database=my_database").
  option("dbtable", "my_schema.test_table").
  option("user", "foo").
  option("password", "bar").
  save()

Alex Savitsky answered Oct 20 '22 13:10
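
Both workarounds from the answers above can be tried without Spark or a database. The following is a minimal sketch, assuming (as the answers suggest) that the notebook feeds the cell to the Scala REPL line by line. The list `nums` and the names `viaParens` and `viaTrailingDots` are made up for illustration and stand in for sampleDF and its method chain.

```scala
// Plain-Scala stand-in for the DataFrame chain; no Spark required.
// `nums` is a hypothetical substitute for sampleDF.
val nums = List(5, 3, 8, 1)

// Workaround 1: wrap the chain in parentheses.
// The open paren tells the parser the expression is not finished yet,
// so lines that start with a dot are read as continuations.
val viaParens = (nums
  .sorted
  .take(2)
  .map(_ * 10))

// Workaround 2: put the dot at the end of each line.
// A line ending in a dot cannot be a complete expression,
// so the parser keeps reading the next line.
val viaTrailingDots = nums.
  sorted.
  take(2).
  map(_ * 10)

println(viaParens)        // List(10, 30)
println(viaTrailingDots)  // List(10, 30)
```

Either form should produce the same result; which one to use is mostly a matter of taste. The parenthesized version keeps the leading-dot layout that most Scala style guides prefer.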