
Accessing configuration parameters passed to Airflow through CLI


I am trying to pass the following configuration parameters to the Airflow CLI while triggering a DAG run. This is the trigger_dag command I am using:

airflow trigger_dag  -c '{"account_list":"[1,2,3,4,5]", "start_date":"2016-04-25"}'  insights_assembly_9900  

How can I access the conf parameters passed this way from inside an operator during the DAG run?

devj asked Apr 27 '17 08:04



People also ask

How do you pass parameters to Airflow DAG?

You can pass parameters from the CLI using --conf '{"key":"value"}' and then use them in the DAG file as "{{ dag_run.conf["key"] }}" in a templated field.

How do I access Airflow cfg?

The first time you run Airflow, it will create a file called airflow.cfg in your $AIRFLOW_HOME directory (~/airflow by default). This file contains Airflow's configuration and you can edit it to change any of the settings.


2 Answers

This is probably a continuation of the answer provided by devj.

  1. In airflow.cfg, set the following property to true: dag_run_conf_overrides_params=True

  2. When defining the PythonOperator, pass the argument provide_context=True. For example:

 get_row_count_operator = PythonOperator(task_id='get_row_count', python_callable=do_work, dag=dag, provide_context=True)

  3. Define the Python callable (note the use of **kwargs):

 def do_work(**kwargs):
     table_name = kwargs['dag_run'].conf.get('table_name')
     # Rest of the code

  4. Invoke the DAG from the command line:

 airflow trigger_dag read_hive --conf '{"table_name":"my_table_name"}'
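
The callable from the steps above can be sketched so that it runs without an Airflow installation. This is only an illustration under stated assumptions: the SimpleNamespace object stands in for the dag_run that Airflow would supply, and the "default_table" fallback is a hypothetical value, not something from the original answer. The sketch also guards against conf being None, which can happen when a run is not triggered with --conf.

```python
from types import SimpleNamespace


def do_work(**kwargs):
    # In a real run, Airflow passes the DagRun object in the context as 'dag_run'.
    dag_run = kwargs.get("dag_run")
    # conf may be None when the run was not triggered with --conf,
    # so fall back to an empty dict before calling .get().
    conf = (dag_run.conf if dag_run is not None else None) or {}
    # "default_table" is a hypothetical fallback used only for this sketch.
    return conf.get("table_name", "default_table")


# Stand-in for the object Airflow would provide (assumption, local testing only).
fake_run = SimpleNamespace(conf={"table_name": "my_table_name"})
print(do_work(dag_run=fake_run))
print(do_work(dag_run=SimpleNamespace(conf=None)))
```

Returning the value rather than only using it inside the function keeps the callable easy to check in isolation.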

I have found this discussion to be helpful.

Arnab Biswas answered Oct 19 '22 20:10



There are two ways to access the params passed in the airflow trigger_dag command.

  1. In the callable method defined in the PythonOperator, access the params as kwargs['dag_run'].conf.get('account_list')

  2. If the field where you need the value is a templated field, use {{ dag_run.conf['account_list'] }}

The schedule_interval for the externally triggerable DAG is set to None for the above approaches to work.
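
One detail worth checking with the command in the question: account_list is wrapped in quotes inside the JSON, so after Airflow decodes the payload it is the string "[1,2,3,4,5]" rather than a list. A minimal sketch of handling that with the standard library follows; parse_account_list is a hypothetical helper introduced here for illustration, not part of Airflow.

```python
import json

# The -c payload from the question, exactly as typed on the command line.
raw_conf = '{"account_list":"[1,2,3,4,5]", "start_date":"2016-04-25"}'

# The CLI payload is JSON; decoding it here mimics what ends up in dag_run.conf.
conf = json.loads(raw_conf)


def parse_account_list(conf):
    """Hypothetical helper: return account_list as a Python list."""
    value = conf.get("account_list", [])
    # A quoted list is still a string after the first decode and needs a second one.
    if isinstance(value, str):
        value = json.loads(value)
    return value


accounts = parse_account_list(conf)
print(accounts)
```

Passing the list unquoted, as in '{"account_list":[1,2,3,4,5]}', would avoid the second decode; the helper accepts both forms.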

devj answered Oct 19 '22 22:10
