I am trying to configure Airbnb Airflow to use the CeleryExecutor. I changed the executor in airflow.cfg from the SequentialExecutor to the CeleryExecutor:
# The executor class that airflow should use. Choices include
# SequentialExecutor, LocalExecutor, CeleryExecutor
executor = CeleryExecutor
But I get the following error:
airflow.configuration.AirflowConfigException: error: cannot use sqlite with the CeleryExecutor
Note that the sql_alchemy_conn is configured like this:
sql_alchemy_conn = sqlite:////root/airflow/airflow.db
I looked at Airflow's source on GitHub (https://github.com/airbnb/airflow/blob/master/airflow/configuration.py) and found that the following code throws this exception:
def _validate(self):
    if (
            self.get("core", "executor") != 'SequentialExecutor' and
            "sqlite" in self.get('core', 'sql_alchemy_conn')):
        raise AirflowConfigException("error: cannot use sqlite with the {}".
            format(self.get('core', 'executor')))
It seems from this _validate method that sql_alchemy_conn cannot contain "sqlite" whenever an executor other than the SequentialExecutor is configured.
Do you have any idea how to configure the CeleryExecutor without SQLite? Please note that I have installed RabbitMQ for working with the CeleryExecutor, as required.
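For context, wiring the Celery side to RabbitMQ also happens in airflow.cfg. A minimal sketch, assuming a stock RabbitMQ on localhost and a PostgreSQL database reused for Celery results (all credentials and names are placeholders, and the key names below are from Airflow versions of this era):

[celery]
# Placeholder values; point these at your own broker and result database.
broker_url = amqp://guest:guest@localhost:5672//
celery_result_backend = db+postgresql://airflow:airflow@localhost:5432/airflow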
Airflow comes configured with the SequentialExecutor by default, which is a local executor and the safest option for execution, but it is strongly recommended that you change this to the LocalExecutor for small, single-machine installations, or to one of the remote executors for a multi-machine/cloud installation.
The LocalExecutor runs tasks by spawning processes in a controlled fashion in different modes. Since BaseExecutor can receive a parallelism parameter that limits the number of processes spawned, the LocalExecutor can spawn an unlimited number of processes when this parameter is 0.
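As a concrete illustration, a minimal [core] sketch for such a single-machine setup could look like this (the value 8 is an arbitrary example, not a recommendation):

[core]
executor = LocalExecutor
# Upper bound on concurrently running task processes; 0 means unlimited.
parallelism = 8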
To set up an Airflow cluster, we need to install the following components and services:
Airflow Webserver: a web interface to query the metadata database and to monitor and execute DAGs.
Airflow Scheduler: checks the status of the DAGs and tasks in the metadata database, creates new ones if necessary, and sends the tasks to the queues. The commands to start these services are sketched below.
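A sketch of starting these services with the Airflow CLI of this era, assuming the webserver's default port of 8080:

# start the web UI (port 8080 is the default)
airflow webserver -p 8080
# start the scheduler
airflow scheduler
# start a Celery worker (CeleryExecutor setups only)
airflow worker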
The Airflow documentation states that the CeleryExecutor requires a backend other than the default SQLite database. You have to use MySQL or PostgreSQL, for example.
The sql_alchemy_conn in airflow.cfg must be changed to follow the SQLAlchemy connection string structure (see the SQLAlchemy documentation).
For example,
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@127.0.0.1:5432/airflow
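Before running airflow initdb, it can help to confirm that the connection string actually works. A minimal sketch using SQLAlchemy directly, assuming psycopg2 is installed and reusing the hypothetical credentials from the example above:

from sqlalchemy import create_engine

# Hypothetical user, password, host and database; adjust to your setup.
engine = create_engine("postgresql+psycopg2://airflow:airflow@127.0.0.1:5432/airflow")

# Opening a connection is enough to prove the metadata DB is reachable.
with engine.connect() as conn:
    print("Connected to:", conn.engine.url)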
To configure Airflow for MySQL, first install MySQL (this might help, or just Google it), then:
1. Locate the line
sql_alchemy_conn = sqlite:////home/vipul/airflow/airflow.db
2. Add a # in front of it, if you have the default SQLite connection, so it looks like
#sql_alchemy_conn = sqlite:////home/vipul/airflow/airflow.db
3. Add this line below it:
sql_alchemy_conn = mysql://:@localhost:3306/
4. Save the file and run the command
airflow initdb
and done! (A filled-in example follows below.)
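For illustration, with a hypothetical MySQL user airflow, password airflow, and a database named airflow created beforehand (and a MySQL driver such as mysqlclient installed), the filled-in line would read:

sql_alchemy_conn = mysql://airflow:airflow@localhost:3306/airflow

After airflow initdb succeeds, the metadata tables should appear in that MySQL database instead of the old SQLite file.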