
Call R notebooks on Databricks from a second R notebook

I am trying to call an R notebook on Databricks, passing parameters to it via spark-submit.

My approach looks like this:

com <- "spark-submit foo.R p1 & spark-submit foo.R p2"
system(com)

This should call the script foo.R and hand over the parameter p1.

This returns:

 sh: 1: spark-submit: not found
 sh: 1: spark-submit: not found

I would expect this to submit the two jobs to the Spark cluster. Any idea what I am missing? Thanks!

asked Dec 16 '25 by CKre

1 Answer

I assume you tried to run these commands from an R notebook. The "sh: 1: spark-submit: not found" errors mean that spark-submit is not on the PATH of the shell spawned by system(), so this approach will not work there. The standard way to call other notebooks from a Databricks notebook is dbutils.notebook.run, which currently only works in Python and Scala.

You can work around it by adding a Python cell to your R notebook:

%python
# Run foo.R twice: path, timeout in seconds, and a parameter map per call
dbutils.notebook.run("foo.R", 60, {"argument": "p1"})
dbutils.notebook.run("foo.R", 60, {"argument": "p2"})

If you generate the notebook parameters p1 and p2 in R rather than hard-coding them, you can use a temporary view to pass them to the Python cell, as sketched below.
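
A minimal sketch of that hand-off; the view name notebook_params and the column name argument are placeholders I chose for illustration, not fixed names:

%r
library(SparkR)
# Write the parameters generated in R into a temporary view
params <- createDataFrame(data.frame(argument = c("p1", "p2")))
createOrReplaceTempView(params, "notebook_params")

%python
# Read the parameters back from the view and run foo.R once per value
for row in spark.sql("SELECT argument FROM notebook_params").collect():
    dbutils.notebook.run("foo.R", 60, {"argument": row["argument"]})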

answered Dec 19 '25 by marat