While defining a function to be later used as a python_callable, why is 'ds' included as the first arg of the function?
For example:
def python_func(ds, **kwargs):
    pass
I looked into the Airflow documentation, but could not find any explanation.
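On daily tasks, using ds makes sense because we usually need to process the data of one specific day, typically the previous day. Note that ds is not an Airflow Variable; it is a value Airflow passes to your function that holds the execution date of the task run.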
op_kwargs (dict) – A dict of keyword arguments to pass to python_callable.
provide_context (bool) – if set to true, Airflow will pass a set of keyword arguments that can be used in your function. This set of kwargs correspond exactly to what you can use in your jinja templates.
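As a rough illustration of working with ds inside a callable, the sketch below parses the "YYYY-MM-DD" string with the standard library and derives the day before it. The function name process_day and the sample date are hypothetical, and the function is called by hand here, so no Airflow import is needed to run it.

```python
from datetime import datetime, timedelta


def process_day(ds, **kwargs):
    # ds arrives as a "YYYY-MM-DD" string; turn it into a date object.
    run_date = datetime.strptime(ds, "%Y-%m-%d").date()
    # Derive the day before the execution date.
    previous_day = run_date - timedelta(days=1)
    return run_date.isoformat(), previous_day.isoformat()


# Called by hand with made-up values; in a DAG, Airflow would supply them.
result = process_day(ds="2018-03-01", ds_nodash="20180301")
print(result)
```

Whether a given run should read the data for ds itself or for the day before depends on how your schedule and data are laid out, so treat the date arithmetic as a placeholder.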
This is related to the provide_context=True parameter. As per the Airflow documentation:
if set to true, Airflow will pass a set of keyword arguments that can be used in your function. This set of kwargs correspond exactly to what you can use in your jinja templates. For this to work, you need to define **kwargs in your function header.
ds is one of these keyword arguments and represents the execution date in the format "YYYY-MM-DD". For parameters that are marked as (templated) in the documentation, you can use the '{{ ds }}' default variable to pass the execution date. You can read more about default variables here:
https://pythonhosted.org/airflow/code.html?highlight=pythonoperator#default-variables (obsolete)
https://airflow.incubator.apache.org/concepts.html?highlight=python_callable
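The python_callable parameter of PythonOperator is not templated, so doing something like
python_callable=print_execution_date('{{ ds }}')
won't work: that expression calls the function immediately, while the DAG file is being parsed, with the literal string '{{ ds }}', and hands its return value to the operator instead of the function itself. To print the execution date inside the callable function of your PythonOperator, you will have to do it as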
def print_execution_date(ds, **kwargs):
    print(ds)
or
def print_execution_date(**kwargs):
    print(kwargs.get('ds'))
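To see why both signatures work, here is a minimal sketch in plain Python with no Airflow import. It assumes, as a simplification, that the operator calls your function with the context dictionary unpacked as keyword arguments; fake_context and its values are made up for illustration, and the functions return values instead of printing so the result is easy to inspect.

```python
def with_named_ds(ds, **kwargs):
    # 'ds' is bound by name; every other context key lands in kwargs.
    return ds, sorted(kwargs)


def with_kwargs_only(**kwargs):
    # Here 'ds' stays inside kwargs and has to be looked up.
    return kwargs.get("ds")


# Hypothetical stand-in for the context Airflow builds for a task run.
fake_context = {"ds": "2018-01-01", "ds_nodash": "20180101"}

# Simplified stand-in for the call the operator makes on your behalf.
named_result = with_named_ds(**fake_context)
kwargs_result = with_kwargs_only(**fake_context)
print(named_result)
print(kwargs_result)
```

So ds does not have to be the first argument in any special sense: naming it in the signature is just a shortcut that pulls that one key out of the keyword arguments.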
Hope this helps.