Is there a way to execute Spark code locally with databricks-connect?
The reason is that I would like to execute some tests as part of my CI/CD pipeline without the need to have a cluster up and running.
Unfortunately, there is no local instance of Databricks; it can only be used as a cloud service. Databricks is available on Microsoft Azure and AWS. If you want to try Databricks, you can use Databricks Community Edition, which is free of cost.
Run a Spark SQL job: In the left pane, select Azure Databricks. From the Common Tasks, select New Notebook. In the Create Notebook dialog box, enter a name, select Python as the language, and select the Spark cluster that you created earlier. Select Create.
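Once the notebook is attached to the cluster, a cell can run a Spark SQL query. Here is a minimal sketch; the dataset path, temporary view name, and query are only illustrative, not part of any specific tutorial:

```python
# Sketch of a Spark SQL cell in a Databricks Python notebook.
# The dataset path and view name ("diamonds") are assumptions for illustration.
df = spark.read.csv(
    "/databricks-datasets/Rdatasets/data-csv/ggplot2/diamonds.csv",
    header=True,
    inferSchema=True,
)
df.createOrReplaceTempView("diamonds")

# Run a Spark SQL query against the temporary view and show the result.
result = spark.sql("SELECT cut, AVG(price) AS avg_price FROM diamonds GROUP BY cut")
display(result)
```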
To run a shell command on all nodes, use an init script. The %fs magic lets you use dbutils filesystem commands; for example, instead of calling dbutils.fs.ls to list files, you can write %fs ls. For more information, see How to work with files on Azure Databricks.
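As a sketch (the path below is just an example), the two forms are equivalent inside a Databricks notebook:

```python
# Inside a Databricks notebook, dbutils is available without an import.
# List files under a path (the path here is an example):
for f in dbutils.fs.ls("/databricks-datasets"):
    print(f.path)

# The equivalent magic-command form, in its own cell, would be:
# %fs ls /databricks-datasets
```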
No, databricks-connect requires a running cluster.
If you do not use any Databricks-specific code (such as dbutils), you can run Spark locally and execute your tests against that, assuming you can still access the data sources you need.
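For CI tests that avoid Databricks-specific APIs, a local SparkSession is enough. Below is a hedged sketch using a pytest fixture; the fixture name, test, and sample data are purely illustrative:

```python
import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark():
    # Build a purely local SparkSession for CI; no cluster or
    # databricks-connect configuration is involved.
    session = (
        SparkSession.builder
        .master("local[2]")
        .appName("ci-tests")
        .getOrCreate()
    )
    yield session
    session.stop()


def test_simple_transformation(spark):
    # A trivial transformation to show the pattern; replace with
    # your own logic and test data.
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
    assert df.filter(df.id > 1).count() == 1
```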