First, I need to apologize because I'm not able to provide a clear MCVE for my question yet. My question is about a strange phenomenon I encountered deep in a code base and I would like to understand how this can happen, so in a way I'm asking how I can create the MCVE for this phenomenon in the first place.
How can it be that it matters whether or not an as clause is used in a with statement where the assigned variable isn't used at all?
We are using Airflow (the Apache project), which provides a class called DAG. This class is supposed to be used as a context manager in with statements like this:
with DAG(**some_parameters) as dag:
    do_something_with(dag)
This works as expected.
However, in some cases we do not use the dag variable inside the with block, so IDEs warn about it. As an alternative to renaming it to _dag (to declare the non-usage), I tried removing the as dag clause completely:
with DAG(**some_parameters):
    do_something_without_passing_dag()
To my understanding of Python, this should be equivalent at runtime to the version with the as dag clause:
with DAG(**some_parameters) as dag:
    do_something_without_passing_dag()
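As a sanity check on that expectation, here is a minimal sketch in plain Python, independent of Airflow. Tracker is a hypothetical stand-in for DAG, not an Airflow class. It suggests that __enter__() and __exit__() run identically in both forms and that the as clause only binds the return value of __enter__() to a name:

```python
class Tracker:
    """Hypothetical stand-in for a context manager such as DAG."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False  # do not suppress exceptions


def run_with_as():
    tracker = Tracker()
    with tracker as bound:
        # The as clause binds whatever __enter__() returned.
        bound_is_same_object = bound is tracker
    return tracker.events, bound_is_same_object


def run_without_as():
    tracker = Tracker()
    with tracker:
        # __enter__() still runs; its return value is simply discarded.
        pass
    return tracker.events


events_with_as, bound_is_tracker = run_with_as()
events_without_as = run_without_as()
```

So, as far as the context manager protocol itself is concerned, the only observable difference is the extra name binding.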
But, surprisingly, in the context of the Airflow project there seems to be a difference between the two. With the as dag clause the code works as expected; without the as dag clause, an error is raised (see the end of this post). Distressingly, this error appears in the log of the Airflow process and does not contain any references to my code.
I need to point out that in the Airflow context, these with statements are at the top level of a small module, so the as clause creates a module-global variable if present. I don't know if this is relevant. If it is, I don't understand why.
To my understanding, it should never make any difference whether I provide an as clause if I do not use the variable at all. Here it nevertheless seems to.
I already investigated three aspects:

1. The __enter__() method of the DAG class. In both cases, both the input (arguments) and the output (return value) were the same (the return value was a context manager object, of course). So there didn't seem to be any difference here based on the existence of the as clause.
2. In the version with the as clause, I deleted the variable (del dag) as the first statement inside the with block. This version then behaved like the version without the as clause, i.e. it raised an error.
3. In the __enter__() method, the current context object is stored in a DagContext class, and do_something_without_passing_dag() can (and will) access the DAG object from the DagContext. But since all of this is independent of the variable created by the as clause, I don't see how this could matter.

Can anybody provide an explanation of why this can be the case?
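The DagContext mechanism described above can be sketched roughly as follows. FakeDagContext and FakeDAG are hypothetical names made up for illustration, and this is not Airflow's actual implementation; it only shows the general pattern of __enter__() registering the current object so that code inside the block can find it without being passed a variable:

```python
class FakeDagContext:
    """Hypothetical registry of the currently active context object."""

    _stack = []

    @classmethod
    def push(cls, dag):
        cls._stack.append(dag)

    @classmethod
    def pop(cls):
        return cls._stack.pop()

    @classmethod
    def current(cls):
        return cls._stack[-1] if cls._stack else None


class FakeDAG:
    """Hypothetical stand-in for a DAG-like context manager."""

    def __init__(self, dag_id):
        self.dag_id = dag_id
        self.tasks = []

    def __enter__(self):
        FakeDagContext.push(self)  # register as the active object
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        FakeDagContext.pop()  # unregister on leaving the block
        return False


def do_something_without_passing_dag():
    # Looks up the active object instead of receiving it as an argument.
    dag = FakeDagContext.current()
    dag.tasks.append("task")
    return dag


with FakeDAG("example"):
    seen = do_something_without_passing_dag()
```

In this sketch the lookup works with or without an as clause, which matches the observation that this part alone does not explain the difference.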
Here is the error stack trace I found in the Airflow log:
webserver_1 | Traceback (most recent call last):
webserver_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app
webserver_1 | response = self.full_dispatch_request()
webserver_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1951, in full_dispatch_request
webserver_1 | rv = self.handle_user_exception(e)
webserver_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1820, in handle_user_exception
webserver_1 | reraise(exc_type, exc_value, tb)
webserver_1 | File "/usr/local/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
webserver_1 | raise value
webserver_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1949, in full_dispatch_request
webserver_1 | rv = self.dispatch_request()
webserver_1 | File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1935, in dispatch_request
webserver_1 | return self.view_functions[rule.endpoint](**req.view_args)
webserver_1 | File "/usr/local/lib/python3.7/site-packages/flask_admin/base.py", line 69, in inner
webserver_1 | return self._run_view(f, *args, **kwargs)
webserver_1 | File "/usr/local/lib/python3.7/site-packages/flask_admin/base.py", line 368, in _run_view
webserver_1 | return fn(self, *args, **kwargs)
webserver_1 | File "/usr/local/lib/python3.7/site-packages/flask_login/utils.py", line 258, in decorated_view
webserver_1 | return func(*args, **kwargs)
webserver_1 | File "/usr/local/lib/python3.7/site-packages/airflow/www/utils.py", line 281, in wrapper
webserver_1 | return f(*args, **kwargs)
webserver_1 | File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, in wrapper
webserver_1 | return func(*args, **kwargs)
webserver_1 | File "/usr/local/lib/python3.7/site-packages/airflow/www/views.py", line 1958, in paused
webserver_1 | models.DagModel.get_dagmodel(dag_id).set_is_paused(is_paused=is_paused)
webserver_1 | File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, in wrapper
webserver_1 | return func(*args, **kwargs)
webserver_1 | File "/usr/local/lib/python3.7/site-packages/airflow/models/dag.py", line 1562, in set_is_paused
webserver_1 | subdags = self.get_dag().subdags
webserver_1 | AttributeError: 'NoneType' object has no attribute 'subdags'
Your do_something_without_passing_dag() is not supposed to know that DAG(**some_parameters) should be passed as its "dag" parameter.
For instance, this works:
dag = DAG(**some_parameters)
with dag:
    do_something_without_passing_dag()
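One likely reason this variant works, as far as I understand the Airflow documentation: Airflow loads the DAG objects it can find in a DAG file, which means the DAG must appear in the module's globals(). Without the as clause (or after del dag), no module-level name refers to the DAG, so it would not be found, which would be consistent with get_dag() returning None in the traceback. The sketch below only imitates that idea; FakeDAG, load_module and collect_dags are hypothetical names, not Airflow's API:

```python
import types


class FakeDAG:
    """Hypothetical stand-in for a DAG-like context manager."""

    def __init__(self, dag_id):
        self.dag_id = dag_id

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False


def load_module(name, source):
    """Execute source as the top level of a fresh module."""
    module = types.ModuleType(name)
    module.__dict__["FakeDAG"] = FakeDAG  # make the class available to the source
    exec(source, module.__dict__)
    return module


def collect_dags(module):
    """Assumed discovery rule: keep FakeDAG instances bound to module globals."""
    return [obj.dag_id for obj in vars(module).values() if isinstance(obj, FakeDAG)]


SOURCE_WITH_AS = 'with FakeDAG("a") as dag:\n    pass\n'
SOURCE_WITHOUT_AS = 'with FakeDAG("b"):\n    pass\n'
SOURCE_ASSIGNED = 'dag = FakeDAG("c")\nwith dag:\n    pass\n'

found_with_as = collect_dags(load_module("m1", SOURCE_WITH_AS))
found_without_as = collect_dags(load_module("m2", SOURCE_WITHOUT_AS))
found_assigned = collect_dags(load_module("m3", SOURCE_ASSIGNED))
```

Under this assumed discovery rule, only the variants that leave a module-level name bound to the object are picked up, so renaming the variable to _dag would keep the DAG discoverable while silencing the IDE warning.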