
Create a new cluster in Databricks using databricks-cli

I'm trying to create a new cluster in Databricks on Azure using databricks-cli.

I'm using the following command:

databricks clusters create --json '{ "cluster_name": "template2", "spark_version": "4.1.x-scala2.11" }'

And getting back this error:

Error: {"error_code":"INVALID_PARAMETER_VALUE","message":"Missing required field: size"}

I can't find documentation on this error and would appreciate some help.

Mor Shemesh asked Jun 06 '18 13:06





2 Answers

I found the right answer: the request was also missing a node type and a cluster size (given here as an autoscale range).

The correct format for this command on Azure is:

databricks clusters create --json '{ "cluster_name": "my-cluster", "spark_version": "4.1.x-scala2.11", "node_type_id": "Standard_DS3_v2", "autoscale" : { "min_workers": 2, "max_workers": 50 } }'
Mor Shemesh answered Oct 26 '22 13:10



Just to add to the answer that @MorShemesh gave, you can also use a path to a JSON file instead of specifying the JSON at the command line.

databricks clusters create --json-file /path/to/my/cluster_config.json 

If you are managing lots of clusters this might be an easier approach.
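
As a minimal sketch of what such a file might contain, the snippet below writes the settings from the other answer to disk. The file name cluster_config.json and its location in the current directory are assumptions for illustration; the field values are copied from the command above and may need adjusting for your workspace.

```shell
# Write the cluster settings to a JSON file.
# The file name is illustrative; any path works.
cat > cluster_config.json <<'EOF'
{
  "cluster_name": "my-cluster",
  "spark_version": "4.1.x-scala2.11",
  "node_type_id": "Standard_DS3_v2",
  "autoscale": {
    "min_workers": 2,
    "max_workers": 50
  }
}
EOF

# Then create the cluster from the file (needs a configured databricks-cli):
# databricks clusters create --json-file cluster_config.json
```

Keeping one file per cluster makes the definitions easy to review and put under version control.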

Raphael K answered Oct 26 '22 13:10
