How do I use an Airflow variable inside a Databricks notebook?

I have a Databricks PySpark notebook that gets called from an Airflow DAG. I created a variable in Airflow by going to Admin → Variables and adding a key-value pair.

I cannot find a way to use that Airflow variable in Databricks.

Edit: adding a sample of my code.
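
The token value used below is presumably read from the Airflow Variable in the DAG file. A minimal sketch of that step, assuming the variable key is "token":

from airflow.models import Variable

# Read the value created under Admin → Variables (key assumed to be "token")
token = Variable.get("token")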

notebook_task = {
    'notebook_path': '/Users/[email protected]/myDAG',
    'base_parameters': {
        # passed into the notebook; read there with dbutils.widgets.get("token")
        "token": token
    }
}

and the operator is defined here:

opr_submit_run = DatabricksSubmitRunOperator(
    task_id='run_notebook',
    existing_cluster_id='xxxxx', 
    run_name='test',
    databricks_conn_id='databricks_xxx',  
    notebook_task=notebook_task
)

What ended up working is using base_parameters instead of notebook_params, which is documented at https://docs.databricks.com/dev-tools/api/latest/jobs.html

and accessing it from Databricks with:

my_param = dbutils.widgets.get("token")
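
Optionally, the notebook can declare the widget with a default value so it also runs when launched interactively. A small sketch, assuming an empty string as the default:

# create the widget with an empty default; a value passed via
# base_parameters overrides it when the notebook runs as a job
dbutils.widgets.text("token", "")
my_param = dbutils.widgets.get("token")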
asked Sep 20 '25 by ggofthejungle


1 Answer

If you set it as a parameter of the notebook call (base_parameters inside notebook_task), then you need to use the dbutils.widgets.get function. Put something like this at the beginning of the notebook:

my_param = dbutils.widgets.get("key")
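
As an alternative to reading the variable in the DAG file, DatabricksSubmitRunOperator templates its json argument, so a Jinja reference to the Airflow Variable can be rendered at run time. Exact behavior depends on the Airflow and provider versions, so treat this as a sketch:

notebook_task = {
    'notebook_path': '/Users/[email protected]/myDAG',
    'base_parameters': {
        # rendered at run time from the Airflow Variable named "token"
        "token": "{{ var.value.token }}"
    }
}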
answered Sep 23 '25 by Alex Ott