
Cloud Composer (Airflow) jobs stuck

My Cloud Composer managed Airflow has been stuck for hours, ever since I canceled a task instance that was taking too long (let's call it Task A).

I've cleared all the DAG runs and task instances, but a few jobs are still running and one job is in the Shutdown state (I suppose it is the job for Task A) (snapshot of my Jobs).

Besides, it seems that the scheduler is not running, since recently deleted DAGs keep appearing in the dashboard.

Is there a way to kill the jobs or reset the scheduler? Any ideas for getting Composer unstuck are welcome.

Ary Jazz asked Aug 15 '18 13:08



1 Answer

You can restart the scheduler as follows:

From your cloud shell:

1. Determine your environment’s Kubernetes cluster:

gcloud composer environments describe ENVIRONMENT_NAME \
    --location LOCATION 

2. Get credentials and connect to the Kubernetes cluster:

gcloud container clusters get-credentials ${GKE_CLUSTER} --zone ${GKE_LOCATION}

3. Run the following command to restart the scheduler:

kubectl get deployment airflow-scheduler -o yaml | kubectl replace --force -f -

Steps 1 and 2 are detailed here. Step 3 replaces the “airflow-scheduler” deployment with itself, which restarts the service.
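
Step 2 refers to ${GKE_CLUSTER} and ${GKE_LOCATION} without showing where they come from. Below is a minimal sketch of one way to fill them in, assuming the environment reports its cluster as a resource path of the form projects/PROJECT/zones/ZONE/clusters/CLUSTER. The helper names gke_cluster_name and gke_cluster_location are hypothetical, and the path form may differ between Composer versions, so check the output of step 1 before relying on it.

```shell
#!/bin/sh
# Hypothetical helpers: split a GKE cluster resource path into its parts.
# Assumed input form: projects/PROJECT/zones/ZONE/clusters/CLUSTER

gke_cluster_name() {
    # Keep only the text after the last "/", i.e. the cluster name.
    printf '%s\n' "${1##*/}"
}

gke_cluster_location() {
    # Drop the trailing "/clusters/NAME", then keep the last remaining segment.
    _parent="${1%/clusters/*}"
    printf '%s\n' "${_parent##*/}"
}

# Example wiring (commented out; needs gcloud and a real environment):
# CLUSTER_PATH=$(gcloud composer environments describe ENVIRONMENT_NAME \
#     --location LOCATION --format="value(config.gkeCluster)")
# GKE_CLUSTER=$(gke_cluster_name "$CLUSTER_PATH")
# GKE_LOCATION=$(gke_cluster_location "$CLUSTER_PATH")
```

With both variables set this way, the get-credentials command in step 2 can be run as written.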

If restarting the scheduler doesn’t help, you may need to recreate your Composer environment, and troubleshoot your DAGs if this happens every time.

ch_mike answered Oct 20 '22 00:10