I'm trying to call an R notebook on Databricks while passing parameters using spark-submit.
My approach looks like this:
com <- "spark-submit foo.R p1 & spark-submit foo.R p2"
system(com)
This should call the script foo.R and hand over the parameter p1 (and p2 in the second call).
This returns:
sh: 1: spark-submit: not found
sh: 1: spark-submit: not found
I would expect this to submit the two jobs to the Spark cluster. Any idea what I am missing? Thanks!
I assume you attempted to run these commands in an R notebook. The standard way to call other notebooks from a Databricks notebook is dbutils.notebook.run, which currently only works in Python and Scala.
You can work around it by adding a Python cell to your R notebook:
%python
# Run foo.R twice, with a 60-second timeout and one parameter per run
dbutils.notebook.run("foo.R", 60, {"argument": "p1"})
dbutils.notebook.run("foo.R", 60, {"argument": "p2"})
If you generate the notebook parameters p1 and p2 in R, you can use a temporary view to pass them to the Python cell.
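A minimal sketch of that handoff, assuming the values are computed in an R cell and the view name "params" is free to use:

# R cell: publish the parameters as a one-column temporary view
library(SparkR)
params <- createDataFrame(data.frame(value = c("p1", "p2")))
createOrReplaceTempView(params, "params")

%python
# Python cell: read the parameters back and launch one notebook run per value
for row in spark.sql("SELECT value FROM params").collect():
    dbutils.notebook.run("foo.R", 60, {"argument": row["value"]})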