I created a very simple DAG to execute a Python file using PythonOperator. I'm running Airflow from a Docker image, but it doesn't recognize the module that contains my .py file.
The structure is like this:
main_dag.py
plugins/__init__.py
plugins/njtransit_scrapper.py
plugins/sql_queries.py
plugins/config/config.cfg
Command used to run the Airflow Docker image:
docker run -p 8080:8080 -v /My/Path/To/Dags:/usr/local/airflow/dags puckel/docker-airflow webserver
I already tried running airflow initdb and restarting the web server, but it keeps showing the error ModuleNotFoundError: No module named 'plugins'
For the import statement I'm using:
from plugins import njtransit_scrapper
This is my PythonOperator:
tweets_load = PythonOperator(
    task_id='Tweets_load',
    python_callable=njtransit_scrapper.main,
    dag=dag
)
My njtransit_scrapper.py file just collects all the tweets for a Twitter account and saves the result in a Postgres database.
If I remove the PythonOperator code and the import, the DAG works fine. I have already tested almost everything, but I'm not sure whether this is a bug or something else.
Is it possible that, when I created a volume for the Docker image, it only imported the main DAG and stopped there, so the rest of the package was never imported?
If the error occurs due to a circular dependency, it can be resolved by moving the imported classes to a third file and importing them from this file. If the error occurs due to a misspelled name, the name of the class in the Python file should be verified and corrected.
You can do it in one of these ways: (1) add your modules to one of the folders that Airflow automatically adds to PYTHONPATH; (2) add the extra folders where you keep your code to PYTHONPATH; (3) package your code into a Python package and install it together with Airflow.
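The second option above (putting an extra folder on the import path) can be tried outside Airflow with plain Python. The sketch below is only an illustration under stated assumptions: it builds a throwaway package in a temporary folder, and the package name demo_plugins is a hypothetical stand-in for the plugins folder from the question. It shows that the import fails until the parent folder of the package is on sys.path, which is what PYTHONPATH feeds at interpreter start-up.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway layout standing in for the question's folder (names are hypothetical):
#   <root>/demo_plugins/__init__.py
#   <root>/demo_plugins/njtransit_scrapper.py
root = Path(tempfile.mkdtemp())
package_dir = root / "demo_plugins"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")
(package_dir / "njtransit_scrapper.py").write_text(
    "def main():\n    return 'scraper ran'\n"
)


def can_import(name):
    """Return True if `name` can be imported with the current sys.path."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return True
    except ModuleNotFoundError:
        return False


# The parent folder is not on sys.path yet, so the import fails.
found_before = can_import("demo_plugins.njtransit_scrapper")

# Add the parent folder (not the package folder itself) to the import path.
sys.path.insert(0, str(root))
found_after = can_import("demo_plugins.njtransit_scrapper")

# The same import style as in the question now resolves.
from demo_plugins import njtransit_scrapper

result = njtransit_scrapper.main()
print(found_before, found_after, result)
```

In a container, the equivalent step would be making sure the folder that contains the package is actually mounted and is on PYTHONPATH. Which folders a given Airflow image adds by default is not something this sketch verifies.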
To create a DAG in Airflow, you always have to import the DAG class. After the DAG class come the imports of the Operators: for each Operator you want to use, you have to make the corresponding import. For example, if you want to execute a Python function, you have to import the PythonOperator.
To help others who might land on this page and get this error because of the same mistake I made, I will record it here.
I had an unnecessary __init__.py file in the dags/ folder.
Removing it solved the problem, and allowed all the dags to find their dependency modules.
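A quick way to check for this situation is to look for an __init__.py directly inside the DAGs folder. The sketch below is a minimal illustration, not an Airflow feature: the helper has_top_level_init is a hypothetical name, and the demo runs against a temporary folder rather than a real installation.

```python
import tempfile
from pathlib import Path


def has_top_level_init(dags_folder):
    """Return True if an __init__.py sits directly inside dags_folder."""
    return (Path(dags_folder) / "__init__.py").is_file()


# Demo on a temporary folder that imitates a dags/ directory.
dags_folder = Path(tempfile.mkdtemp())
(dags_folder / "main_dag.py").write_text("# DAG definition would go here\n")
(dags_folder / "__init__.py").write_text("")

before_cleanup = has_top_level_init(dags_folder)

# Remove the stray file, as described in the answer above.
(dags_folder / "__init__.py").unlink()
after_cleanup = has_top_level_init(dags_folder)

print(before_cleanup, after_cleanup)
```

To use it on a real setup, pass the path of your own DAGs folder to has_top_level_init. Back up the file before deleting it in case another part of your project relies on it.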