Debugging Airflow Tasks with IDE tools?

Tags:

python

debugging

ide

pycharm

airflow

My Airflow DAGs mainly consist of PythonOperators, and I would like to use my Python IDE's debug tools to develop Python "inside" Airflow. I rely on Airflow's database connectors, which I think would be ugly to move "out" of Airflow for development.

I have been using Airflow for a bit, and have so far only managed to develop and debug via the CLI, which is starting to get tiresome.

Does anyone know of a nice way to set up PyCharm, or another IDE, that enables me to use the IDE's debug toolset when running `airflow test ..`?

212

asked Nov 19 '19 10:11

Mathias Andersen

4 Answers

For VSCode, the following debug configuration launches the task under the built-in debugger:

    {
        "name": "Airflow Test - Example",
        "type": "python",
        "request": "launch",
        "program": "`pyenv which airflow`",  // or path to airflow 
        "console": "integratedTerminal",
        "args": [ // Airflow 1.x form; Airflow 2.x uses "tasks", "test" and -S instead of -sd
            "test",
            "mydag",
            "mytask",
            "`date +%Y-%m-%dT00:00:00`", // current date 
            "-sd",
            "path/to/mydag" // providing the subdirectory makes this faster
        ]
    }

I'd assume similar configurations work for other IDEs.
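Whether the backtick expressions in the config above get expanded may depend on your shell and VS Code version. If they are not, the same values can be resolved up front. This is a minimal sketch, assuming the Airflow 1.x `airflow test` syntax used above; `make_launch_config` and its parameters are hypothetical names, not part of Airflow or VS Code, and the paths are placeholders:

```python
import json
import shutil
from datetime import date


def make_launch_config(dag_id, task_id, dag_path, airflow_path=None, day=None):
    """Build a VS Code launch configuration dict for `airflow test` (Airflow 1.x syntax)."""
    # Resolve the airflow script on PATH unless an explicit path is given.
    program = airflow_path or shutil.which("airflow")
    if program is None:
        raise FileNotFoundError("airflow executable not found on PATH")
    day = day or date.today()
    return {
        "name": f"Airflow Test - {dag_id}.{task_id}",
        "type": "python",
        "request": "launch",
        "program": program,
        "console": "integratedTerminal",
        "args": [
            "test",
            dag_id,
            task_id,
            day.strftime("%Y-%m-%dT00:00:00"),  # midnight of the chosen day
            "-sd",
            dag_path,
        ],
    }


# Placeholder values; replace them with your own DAG, task, and paths.
config = make_launch_config(
    "mydag",
    "mytask",
    "path/to/mydag",
    airflow_path="/path/to/airflow",
    day=date(2019, 11, 19),
)
print(json.dumps(config, indent=4))
```

The printed JSON can be pasted into the `configurations` list of `launch.json`.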

134

answered Oct 18 '22 22:10

Dan Frank

I might be a little late to the party, but I have been looking for a solution to this as well. I wanted to be able to debug code as close to "production mode" as possible (so nothing involving `airflow test` etc.).

I found a solution in the form of PyCharm's "Python Debug Server". It works the other way around: your IDE listens, and the remote script connects to your editor.

Just add a new run configuration of type "Python Debug Server". You'll get a screen telling you to pip install pydevd-pycharm on the remote side. On that same page you can fill in your local IP, the port on which the debugger should be available, and optional path mappings.

After that, add the two suggested lines of code at the point where you want your debug session to start.

Run the configuration to activate the listener. If all is well, your editor should break as soon as the settrace call is reached.

Edit/Note: If you stop the configuration in your editor, Airflow will continue running the task, so keep that in mind.
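As a rough sketch of where the settrace call could go: the snippet below wraps it in a helper that only connects when an environment variable is set, so the same task code can run without a listener. The helper name `maybe_attach_debugger`, the variable `TASK_DEBUG`, the callable `my_task`, and port 12345 are all assumptions for illustration; host and port must match what you entered in the run configuration.

```python
import os


def maybe_attach_debugger(host="localhost", port=12345):
    """Connect to a listening PyCharm Python Debug Server if TASK_DEBUG=1.

    Returns True if settrace was called, False otherwise.
    """
    # TASK_DEBUG is a hypothetical switch, not an Airflow setting.
    if os.environ.get("TASK_DEBUG") != "1":
        return False
    try:
        import pydevd_pycharm  # installed via `pip install pydevd-pycharm`
    except ImportError:
        return False
    # Execution pauses here once the IDE's listener accepts the connection.
    pydevd_pycharm.settrace(host, port=port, stdoutToServer=True, stderrToServer=True)
    return True


def my_task(**context):
    """Hypothetical python_callable for a PythonOperator."""
    maybe_attach_debugger()
    # ... actual task logic goes here ...
    return "done"
```

With `TASK_DEBUG` unset, `my_task` runs normally and never tries to reach the IDE.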

answered Oct 18 '22 22:10

Blizz

It might be somewhat of a hack, but I found one way to set up PyCharm:

- Use `which airflow` to locate the airflow script in the local Airflow environment, which in my case is just a pipenv
- Add a new run configuration in PyCharm
- Set the Python "Script path" to said airflow script
- Set "Parameters" to test a task: `test dag_x task_y 2019-11-19`

This has only been validated with the SequentialExecutor, which might be important.

It sucks that I have to change test parameters in the run configuration for every new debug/development task, but so far this is pretty useful for setting breakpoints and stepping through code while "inside" the local airflow environment.
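One possible way to avoid editing the run configuration each time is a small launcher script that sets the arguments and runs the airflow script in the same interpreter. This is only a sketch: `build_test_argv` and `run_airflow_in_process` are hypothetical helpers, and it assumes the `airflow` found on PATH is a plain Python script (as it typically is inside a virtualenv on Linux/macOS).

```python
import runpy
import shutil
import sys


def build_test_argv(dag_id, task_id, execution_date, airflow_major=1):
    """Return the CLI arguments for testing a single task."""
    # Airflow 1.x: `airflow test ...`; Airflow 2.x: `airflow tasks test ...`
    subcommand = ["test"] if airflow_major < 2 else ["tasks", "test"]
    return subcommand + [dag_id, task_id, execution_date]


def run_airflow_in_process(args):
    """Run the airflow script in this interpreter so IDE breakpoints are hit."""
    airflow_script = shutil.which("airflow")
    if airflow_script is None:
        raise FileNotFoundError("airflow executable not found on PATH")
    # The airflow CLI reads its arguments from sys.argv.
    sys.argv = [airflow_script] + list(args)
    runpy.run_path(airflow_script, run_name="__main__")
```

A launcher file would then end with a call such as `run_airflow_in_process(build_test_argv("dag_x", "task_y", "2019-11-19"))`, and the run configuration's "Script path" would point at that file instead of the airflow script.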

answered Oct 18 '22 22:10

Mathias Andersen

I debug `airflow test dag_id task_id`, run on a Vagrant machine, using PyCharm. You should be able to use the same method even if you're running Airflow directly on localhost.

PyCharm's documentation on this subject should show you how to create an appropriate "Python Remote Debug" configuration. When you run this config, it waits to be contacted by the bit of code that you've added somewhere (for example in one of your operators). You can then debug as normal, with breakpoints set in PyCharm.

answered Oct 18 '22 22:10

brki
