
%sh command to install a Maven library through a notebook in Databricks

There are different ways to install libraries in Databricks, e.g. using the GUI, the Databricks CLI, etc.

I'd like to know whether it is possible to install Maven libraries through "%sh" commands in a notebook. For Python libraries, for example, one option from within a notebook would be:

dbutils.library.installPyPI()

Another option using "%sh" for Python libraries could be to do something like this:

%sh
sudo apt-get install python3-pip -y
pip3 install --upgrade pyodbc

Is there a corresponding "%sh" command for Maven libraries, for example something like this:

%sh
mvn install --maven-coordinates "com.microsoft.azure.kusto:spark-kusto-connector:2.0.0"
asked Sep 14 '25 13:09 by sbs0202


1 Answer

No, there is no such command that can be run from inside a notebook, and %sh won't help here: it executes only on the driver node, while the library (or libraries) must be installed on all nodes of the cluster. You have the following alternatives for installing a library on the cluster:

  1. Specify the Maven coordinates on the cluster after you have created it
  2. Create a library in the workspace from Maven coordinates and attach it to the cluster
  3. Install the library using an init script that is executed on all nodes - this is handy for Python or R libraries, but for Maven it can be harder because you also need to pull the dependencies
  4. Install the library on an existing cluster via the REST API
  5. Install the library on an existing cluster via the libraries subcommand of the Databricks CLI (it uses the REST API under the hood)
  6. Use the Databricks Terraform Provider and define a cluster or job with libraries
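
As an illustration of option 4, here is a minimal sketch in Python that calls the Libraries API install endpoint (`POST /api/2.0/libraries/install`) using only the standard library. The function names are hypothetical, and the workspace URL, personal access token and cluster ID are placeholders you would need to supply; treat the payload shape as my understanding of the API and check it against the documentation for your workspace.

```python
import json
import urllib.request


def build_install_request(host, token, cluster_id, coordinates):
    """Build (but do not send) a request that installs one Maven library.

    host, token and cluster_id are placeholders supplied by the caller;
    coordinates use the usual group:artifact:version form.
    """
    payload = {
        "cluster_id": cluster_id,
        "libraries": [{"maven": {"coordinates": coordinates}}],
    }
    return urllib.request.Request(
        url=host.rstrip("/") + "/api/2.0/libraries/install",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def install_maven_library(host, token, cluster_id, coordinates):
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError if the server answers with an
    HTTP error status.
    """
    request = build_install_request(host, token, cluster_id, coordinates)
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call would look like `install_maven_library("https://<your-workspace-url>", "<token>", "<cluster-id>", "com.microsoft.azure.kusto:spark-kusto-connector:2.0.0")`. As far as I know the installation is asynchronous, so a successful response only means the request was accepted; the library still has to finish installing on the cluster.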
answered Sep 16 '25 20:09 by Alex Ott