
Passing arguments to sql template from airflow operator

Tags:

python

airflow

If I am using a BigQueryOperator with a SQL Template, how could I pass an argument to the SQL?

File: ./sql/query.sql

SELECT * FROM `dataset.{{ task_instance.variable_for_execution }}`

File: dag.py

BigQueryOperator(
    task_id='compare_tables',
    sql='./sql/query.sql',
    use_legacy_sql=False,
    dag=dag,
)
Rob asked Aug 30 '18


People also ask

Why can’t I use templates with a given parameter in airflow?

The reason is that Airflow defines which parameters can be templated: not all of them can be. To find out whether a given parameter accepts templates, you have two options. The first is to check the documentation.
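The mechanism behind this is the operator's `template_fields` attribute, which whitelists the parameters Airflow runs through Jinja. The class below is a toy stand-in (not real Airflow internals) that sketches the idea: only attributes named in `template_fields` get rendered.

```python
# Toy illustration of Airflow's template_fields whitelist: only the
# attributes listed there are rendered; everything else is left verbatim.
class ToyOperator:
    template_fields = ('sql',)   # 'sql' is rendered; 'task_id' is not

    def __init__(self, task_id, sql):
        self.task_id = task_id
        self.sql = sql

    def render(self, params):
        # Naive stand-in for Jinja substitution of {{ params.key }}.
        for field in self.template_fields:
            value = getattr(self, field)
            for key, val in params.items():
                value = value.replace('{{ params.%s }}' % key, val)
            setattr(self, field, value)

op = ToyOperator(task_id='{{ params.param1 }}',
                 sql='SELECT * FROM `dataset.{{ params.param1 }}`')
op.render({'param1': 'value1'})
print(op.sql)       # rendered, because 'sql' is in template_fields
print(op.task_id)   # left untouched, because 'task_id' is not
```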

How do I pass arguments to a callable in airflow?

Use the op_args and op_kwargs arguments to pass additional arguments to the Python callable. When you set the provide_context argument to True, Airflow passes in an additional set of keyword arguments: one for each of the Jinja template variables and a templates_dict argument.
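From the callable's point of view, this amounts to an ordinary function call: Airflow invokes it as `fn(*op_args, **op_kwargs)`, and with `provide_context=True` it also passes template variables such as `ds` as keyword arguments. A minimal sketch (the function name is illustrative, not from the answer):

```python
# Sketch of the callable side: positional args come from op_args,
# keyword args from op_kwargs, and context keys like 'ds' are passed
# when provide_context=True.
def build_query(table_name, ds=None, **context):
    return 'SELECT * FROM `dataset.%s`  -- execution date %s' % (table_name, ds)

# Simulating the call Airflow would make for
# PythonOperator(python_callable=build_query, op_args=['value1'],
#                provide_context=True):
print(build_query('value1', ds='2018-08-30'))
```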

What is Apache Airflow SQL Server integration?

Airflow SQL Server integration helps users execute SQL commands for extracting and loading data, calling a stored procedure, etc., from the database. In this article, you will learn about Apache Airflow, Microsoft SQL Server, and the steps to set up an Airflow SQL Server integration.

How to create operators in Apache Airflow?

Extensibility is one of the many reasons that make Apache Airflow powerful. You can create any operator you want by extending airflow.models.baseoperator.BaseOperator. There are two methods that you need to override in a derived class: the constructor, which defines the parameters required for the operator, and execute, which performs the operator's work when the task runs.


1 Answer

You can pass arguments in the params parameter, which can then be used in templated fields as follows:

BigQueryOperator(
    task_id='compare_tables',
    sql='SELECT * FROM `dataset.{{ params.param1 }}`',
    params={
        'param1': 'value1',
        'param2': 'value2'
    },
    use_legacy_sql=False,
    dag=dag
)

Or you can keep the SQL in a separate file:

File: ./sql/query.sql

SELECT * FROM `dataset.{{ params.param1 }}`

The params argument takes a dictionary. In general, any operator in Airflow accepts this params argument.
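To see what the rendered query looks like, here is a sketch of the substitution Airflow performs (the real rendering uses Jinja2; this regex stand-in only handles simple `{{ params.key }}` placeholders):

```python
import re

# Substitute {{ params.key }} placeholders the way Jinja rendering
# roughly would for the templated SQL above.
sql = 'SELECT * FROM `dataset.{{ params.param1 }}`'
params = {'param1': 'value1', 'param2': 'value2'}

rendered = re.sub(r'\{\{\s*params\.(\w+)\s*\}\}',
                  lambda m: params[m.group(1)], sql)
print(rendered)  # SELECT * FROM `dataset.value1`
```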

kaxil answered Oct 21 '22