
How to set environment variable in databricks?

Simple question, but I can't find a simple guide on how to set an environment variable in Databricks. Also, is it important to set the environment variable on both the driver and the executors (and would you do this via spark.conf)? Thanks

asked Jul 02 '19 by information_interchange


1 Answer

Before creation:

You can set environment variables while creating the cluster.

Click on Advanced Options => Enter Environment Variables.

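For context, the Environment Variables field takes one KEY=VALUE pair per line. Here is a minimal sketch of reading such a variable from a notebook once the cluster is up (MY_ENV_VAR and my_value are placeholder names, not from the original answer):

%scala
// Suppose the cluster's Environment Variables field contains the line:
//   MY_ENV_VAR=my_value
// After the cluster starts, the variable is visible to the notebook process:
val myVar = sys.env.getOrElse("MY_ENV_VAR", "not set")
println(myVar)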

After creation:

Select your cluster => click on Edit => Advanced Options => edit or enter new Environment Variables => Confirm and Restart.

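On the asker's second question: environment variables set at the cluster level are applied to the driver and the worker (executor) processes alike, so a variable set this way should be visible on both. A quick check, again using the placeholder MY_ENV_VAR:

%scala
// Driver side:
println(sys.env.getOrElse("MY_ENV_VAR", "not set"))

// Executor side: read the variable inside tasks running on the workers.
sc.parallelize(1 to 4, 4)
  .map(_ => sys.env.getOrElse("MY_ENV_VAR", "not set"))
  .collect()
  .foreach(println)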

OR

You can also achieve the desired result with a cluster init script, for example one that appends your environment variable declarations to the file /databricks/spark/conf/spark-env.sh. The init script below illustrates the same mechanism for Spark configuration, writing a custom driver defaults file:

%scala
// Create an init script on DBFS; at cluster startup it writes a custom
// Spark driver configuration file on the driver node.
dbutils.fs.put("dbfs:/databricks/init/set_spark_params.sh","""
  |#!/bin/bash
  |
  |cat << 'EOF' > /databricks/driver/conf/00-custom-spark-driver-defaults.conf
  |[driver] {
  |  "spark.sql.sources.partitionOverwriteMode" = "DYNAMIC"
  |}
  |EOF
  """.stripMargin, true)

For more details, refer to "Databricks – Spark Configuration".

Hope this helps.

answered Nov 15 '22 by CHEEKATLAPRADEEP-MSFT