 

How to debug failed Fargate task initialization

I have a Fargate task scheduled to run via CloudWatch Event rules. On a successful run it writes a timestamp to a database, and it writes a log file to CloudWatch every time it runs.

However, on one occasion the log file was not created and the database was not updated. I suspect the task either never started or failed to start.

In CloudWatch, the event rule shows trigger and invocation at the time I expected the task to run, so I assume the task at least attempted to start.

My question is: is there any way I can debug or log information about the cluster failing to start a task?

Please let me know if I need to provide more information.

Edit: I should specify that I'm looking for a way to read this information in a log file somewhere. I know I can see the failed task's reason in the web console, but that's only available for relatively recent tasks.

I have posted the same question on Reddit: https://www.reddit.com/r/aws/comments/adtqvt/debugging_failed_fargate_task_initialization/ and on the AWS forums: https://forums.aws.amazon.com/thread.jspa?messageID=884638&#884638

user3603567 asked Jan 10 '19 14:01




2 Answers

  1. Go to the cluster and choose the Tasks tab
  2. In the lower pane, choose Stopped for the Desired Task Status value
  3. Locate the desired Task and click its GUID
  4. Scroll down to the Containers section and expand the relevant containers that are experiencing errors

You'll see some kind of Status reason for the error. In my case it was:

CannotStartContainerError: API error (500): failed to initialize logging driver: Cannot determine region for awslogs driver

Edit: I can't really take credit for figuring this out - found it here:

https://github.com/aws/amazon-ecs-agent/issues/1654#issuecomment-437178282
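The status reason shown by the console steps above can also be read through the ECS API. Below is a minimal sketch in Python, assuming boto3 is installed and AWS credentials are configured; the function name and the cluster name "my-cluster" are hypothetical. ECS only keeps stopped tasks visible for a limited time, so this likely has the same recency limit as the console unless it is run on a schedule and its output is stored somewhere.

```python
def stopped_task_reasons(ecs, cluster):
    """Collect stop reasons for tasks ECS still lists as STOPPED in `cluster`.

    `ecs` is expected to be a boto3 ECS client, e.g. boto3.client("ecs").
    """
    results = []
    paginator = ecs.get_paginator("list_tasks")
    for page in paginator.paginate(cluster=cluster, desiredStatus="STOPPED"):
        task_arns = page.get("taskArns", [])
        if not task_arns:
            continue  # describe_tasks needs at least one task ARN
        described = ecs.describe_tasks(cluster=cluster, tasks=task_arns)
        for task in described.get("tasks", []):
            results.append({
                "taskArn": task["taskArn"],
                # Task-level reason, e.g. why the task as a whole was stopped
                "stoppedReason": task.get("stoppedReason"),
                # Container-level reasons, e.g. CannotStartContainerError
                "containers": [
                    {
                        "name": c.get("name"),
                        "reason": c.get("reason"),
                        "exitCode": c.get("exitCode"),
                    }
                    for c in task.get("containers", [])
                ],
            })
    return results


def main():
    import boto3  # assumption: boto3 is installed and credentials are configured

    # "my-cluster" is a placeholder cluster name
    for entry in stopped_task_reasons(boto3.client("ecs"), "my-cluster"):
        print(entry["taskArn"], entry["stoppedReason"])
        for c in entry["containers"]:
            print("  ", c["name"], c["reason"], c["exitCode"])
```

Calling main() prints one line per stopped task followed by one line per container. Passing the client in as an argument keeps the function easy to exercise with a stub.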

kking-biometrica answered Oct 19 '22 10:10


Try going to "CloudWatch -> Logs -> Insights" and clicking "Run Query".
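For reference, a Logs Insights query can also be run from code rather than the console. The following is a rough sketch in Python, assuming boto3 is installed and credentials are configured; the function name, the query string, and the log group name "/ecs/my-task" are illustrative assumptions, not taken from the original answer. It can only find what was actually written to the log group, so it may return nothing if the container never started.

```python
import time

# Logs Insights query: newest 20 log events whose message contains "Error".
QUERY = (
    "fields @timestamp, @message "
    "| filter @message like /Error/ "
    "| sort @timestamp desc "
    "| limit 20"
)


def run_insights_query(logs, log_group, start_epoch, end_epoch,
                       query=QUERY, poll_seconds=1.0):
    """Start a Logs Insights query and poll until it reaches a final state.

    `logs` is expected to be a boto3 CloudWatch Logs client,
    e.g. boto3.client("logs"). Times are Unix epoch seconds.
    Returns each result row as a {field: value} dict.
    """
    query_id = logs.start_query(
        logGroupName=log_group,
        startTime=start_epoch,
        endTime=end_epoch,
        queryString=query,
    )["queryId"]
    while True:
        response = logs.get_query_results(queryId=query_id)
        if response["status"] in ("Complete", "Failed", "Cancelled", "Timeout"):
            break
        time.sleep(poll_seconds)
    if response["status"] != "Complete":
        raise RuntimeError(f"query ended with status {response['status']}")
    return [{cell["field"]: cell["value"] for cell in row}
            for row in response["results"]]
```

A call might look like run_insights_query(boto3.client("logs"), "/ecs/my-task", start, end), where start and end bracket the time the event rule fired.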

Daniel answered Oct 19 '22 10:10