This is probably a stupid newbie mistake, but I'm getting an error running what I thought was basic Scala code (in a Spark notebook running under Jupyter):
val sampleDF = spark.read.parquet("/data/my_data.parquet")
sampleDF
.limit(5)
.write
.format("jdbc")
.option("url", "jdbc:sqlserver://sql.example.com;database=my_database")
.option("dbtable", "my_schema.test_table")
.option("user", "foo")
.option("password", "bar")
.save()
The error:
<console>:1: error: illegal start of definition
.limit(5)
^
What am I doing wrong?
I don't know anything about Jupyter's internals, but I suspect this is an artifact of the Jupyter/REPL interaction: sampleDF on its own line is, for some reason, considered a complete statement by itself. Try wrapping the whole chain in parentheses:
(sampleDF
.limit(5)
.write
.format("jdbc")
.option("url", "jdbc:sqlserver://sql.example.com;database=my_database")
.option("dbtable", "my_schema.test_table")
.option("user", "foo")
.option("password", "bar")
.save())
Jupyter tries to interpret each line as a complete command, so sampleDF is first evaluated as a valid expression on its own; the interpreter then moves to the next line, where the leading .limit(5) produces the error. Alternatively, move the dots to the end of each line, so the interpreter knows "there's more stuff coming":
sampleDF.
limit(5).
write.
format("jdbc").
option("url", "jdbc:sqlserver://sql.example.com;database=my_database").
option("dbtable", "my_schema.test_table").
option("user", "foo").
option("password", "bar").
save()
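The same parsing behavior can be reproduced without Spark at all. Here is a minimal sketch using a plain Scala collection (the names xs, doubled, and tripled are just illustrative) showing both workarounds side by side:

```scala
// In a line-oriented Scala REPL, a line that parses as a complete
// expression ends the statement. A trailing dot (or an open paren)
// signals that the expression continues on the next line.
val xs = List(1, 2, 3)

// Works: trailing dots keep the statement open across lines
val doubled = xs.
  map(_ * 2).
  filter(_ > 2)

// Also works: parentheses group the whole chain into one expression
val tripled = (xs
  .map(_ * 3)
  .filter(_ > 3))
```

Either style is fine; the parenthesized form has the advantage that you can keep the leading-dot layout that most Scala style guides prefer.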