I'm trying to run a Docker image on AWS using their ECS service. The image runs fine locally, but it fails on the Fargate launch type. I've uploaded the image to ECR, and I've created a cluster/service/task from it.
However, my cluster's task status simply reads "DEPROVISIONING (Task failed to start)", but it provides no logs or details of the output of my running image, so I have no idea what's wrong. How do I find more information and diagnose why ECS isn't able to run my image?
If you have trouble starting a task, it may be stopping because of an error. For example, you run the task, it briefly shows a PENDING status, and then it disappears. You can view these stopped-task errors in the Amazon ECS console by selecting the stopped task and inspecting it for error messages.
Some common scenarios that can leave your ECS task stuck in the PENDING state include:
- The Docker daemon is unresponsive.
- The Docker image is large.
- The Amazon ECS container agent lost connectivity with the Amazon ECS service in the middle of a task launch.
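You can also pull the same failure reason from the AWS CLI instead of the console. A minimal sketch, assuming the CLI is configured and treating my-cluster and the task ARN as placeholders:

```bash
# List recently stopped tasks in the cluster (my-cluster is a placeholder)
aws ecs list-tasks --cluster my-cluster --desired-status STOPPED

# Inspect one of them; stopCode and stoppedReason usually explain the failure
aws ecs describe-tasks --cluster my-cluster --tasks <task-arn> \
  --query 'tasks[].{stopCode:stopCode,stoppedReason:stoppedReason}'
```

Note that stopped tasks are only returned for a short time after they stop, so run this soon after the failure.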
Go to Clusters > Tasks > Details > Containers.
You should see the error message in the area marked by the red rectangle in the "Error message" screenshot below.
[Screenshot: Task detail]
[Screenshot: Error message]
As Abhinav says, the message isn't very descriptive (and using the CLI command `aws ecs describe-tasks` doesn't add anything more). The only other option is to log into the host EC2 instance and read the logs there, or to send those logs to CloudWatch: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_cloudwatch_logs.html#cwlogs_user_data
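On Fargate there is no host instance to log into, so sending container output to CloudWatch Logs is the practical route. A minimal sketch using the awslogs log driver, where /ecs/my-app, the region, and the stream prefix are placeholders and `aws logs tail` assumes AWS CLI v2:

```bash
# Hypothetical container-definition fragment in the task definition (taskdef.json):
#   "logConfiguration": {
#     "logDriver": "awslogs",
#     "options": {
#       "awslogs-group": "/ecs/my-app",
#       "awslogs-region": "us-east-1",
#       "awslogs-stream-prefix": "ecs"
#     }
#   }
# (The log group must already exist, or the task will fail to start for that reason.)

# Once the task has run (and failed), read whatever the container printed:
aws logs tail /ecs/my-app --since 1h
```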
The most likely cause (in ECS) is that the cluster doesn't have enough resources to launch the new task. You can sometimes work out the cause from the Metrics tab, or, since mid-2019 (depending on your region), you can enable "CloudWatch Container Insights" in the ECS Account Settings to get more detailed information about memory and CPU reservations.
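Container Insights can also be enabled per cluster from the CLI; a minimal sketch, with my-cluster as a placeholder:

```bash
# Turn on CloudWatch Container Insights for an existing cluster
aws ecs update-cluster-settings \
  --cluster my-cluster \
  --settings name=containerInsights,value=enabled
```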
I may be late to the party, but you can check the container's details instead of the task's.
Go to the failed task -> Details -> Containers (at the bottom) and open the container. Right under the details you'll see a Status reason.
[Screenshot: Opening the container details]
[Screenshot: Getting the reason for failure]
Note: if your task runs more than one container, check the 'Status reason' of each container as in the screenshot above, since it can differ between containers. The CLI sketch below shows a way to read them all at once.
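The same per-container reason is exposed by the API, so you can inspect every container of a stopped task in one call. A minimal sketch, with the cluster name and task ARN as placeholders:

```bash
# Per-container exit codes and failure reasons for a stopped task
aws ecs describe-tasks --cluster my-cluster --tasks <task-arn> \
  --query 'tasks[].containers[].{name:name,exitCode:exitCode,reason:reason}'
```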