How to deal with a stuck ECS deployment?

Kacper Bąk
1 min read · Aug 3, 2024

AWS ECS (Elastic Container Service) is a powerful tool for running containerized applications, but its deployments often get stuck, leaving DevOps engineers banging their heads against the desk.

TL;DR: Something has probably gone wrong if an ECS deployment is not completed within 10–15 minutes.
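The 10–15 minute rule of thumb can be turned into a small watchdog. The sketch below is a minimal example, not a definitive implementation: it assumes `ecs` is a boto3 ECS client (for example `boto3.client("ecs")`), that the service uses the rolling update deployment type so that `rolloutState` is reported, and the function name, timeout, and polling interval are my own illustrative choices.

```python
import time


def wait_for_deployment(ecs, cluster, service, timeout_s=15 * 60, poll_s=15,
                        clock=time.monotonic, sleep=time.sleep):
    """Return True if the PRIMARY deployment completes before the timeout.

    `ecs` is assumed to be a boto3 ECS client. `clock` and `sleep` are
    injectable only to make the function easy to test.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        svc = ecs.describe_services(cluster=cluster, services=[service])["services"][0]
        # The PRIMARY deployment is the one currently being rolled out.
        primary = next(
            (d for d in svc.get("deployments", []) if d.get("status") == "PRIMARY"),
            {},
        )
        state = primary.get("rolloutState")  # may be missing for some service types
        if state == "COMPLETED":
            return True
        if state == "FAILED":
            return False
        sleep(poll_s)
    # Still not finished after the deadline: treat the deployment as stuck.
    return False
```

A call such as `wait_for_deployment(boto3.client("ecs"), "my-cluster", "my-service")` (both names are placeholders) returns `False` once the deployment has failed or outlived the deadline, which is the signal to start investigating.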

Why do ECS deployments get stuck?

ECS deployments often hang when they are driven by CloudFormation and the stack update or rollback does not finish cleanly. CloudFormation waits for the ECS service to reach a steady state, and the service never gets there. This leaves the stack in an UPDATE_ROLLBACK_IN_PROGRESS or UPDATE_ROLLBACK_FAILED state, and the underlying reason is not clearly communicated.
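To see which state the stack is actually in, the status can be read from the CloudFormation API. This is a minimal sketch assuming `cfn` is a boto3 CloudFormation client (for example `boto3.client("cloudformation")`); the helper names and the `ROLLBACK_STATES` set are illustrative, not part of any AWS API.

```python
# The two states mentioned above; the set name is a hypothetical helper.
ROLLBACK_STATES = {"UPDATE_ROLLBACK_IN_PROGRESS", "UPDATE_ROLLBACK_FAILED"}


def stack_status(cfn, stack_name):
    """Return (status, reason) for a stack; reason is "" when none is reported."""
    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    return stack["StackStatus"], stack.get("StackStatusReason", "")


def is_rollback_state(status):
    """True if the stack is in one of the rollback states discussed above."""
    return status in ROLLBACK_STATES
```

For example, `status, reason = stack_status(cfn, "my-stack")` followed by `is_rollback_state(status)`, where `my-stack` is a placeholder name. The reason string is often terse, so the ECS service events are usually the next place to look.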

The most common causes

  • Invalid or corrupted images in your image registry.
  • Insufficient CPU or memory.
  • Load balancer problems.
  • Problems with the ECS container agent.
  • Incorrect security group or VPC configuration.
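Most of the causes listed above leave a trace in the service's event log or in the stop reason of its failed tasks. The following sketch pulls both, assuming `ecs` is a boto3 ECS client; the function names are hypothetical, and ECS only keeps stopped tasks visible for a limited time, so an empty result does not prove that nothing failed.

```python
def recent_service_events(ecs, cluster, service, limit=5):
    """Return the newest service event messages, most recent first."""
    svc = ecs.describe_services(cluster=cluster, services=[service])["services"][0]
    events = sorted(svc.get("events", []), key=lambda e: e["createdAt"], reverse=True)
    return [e["message"] for e in events[:limit]]


def stopped_task_reasons(ecs, cluster, service):
    """Collect task-level and container-level stop reasons for a service."""
    arns = ecs.list_tasks(
        cluster=cluster, serviceName=service, desiredStatus="STOPPED"
    )["taskArns"]
    if not arns:
        # describe_tasks rejects an empty task list, so return early.
        return []
    reasons = []
    for task in ecs.describe_tasks(cluster=cluster, tasks=arns)["tasks"]:
        if task.get("stoppedReason"):
            reasons.append(task["stoppedReason"])
        for container in task.get("containers", []):
            if container.get("reason"):
                reasons.append(f'{container["name"]}: {container["reason"]}')
    return reasons
```

Image pull failures, out-of-memory kills, and failed load balancer health checks typically show up in one of these two lists, which narrows down which cause applies.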

In my case, after ruling out all of the causes above, the fix turned out to be deleting the service and re-creating it. Starting from scratch with ECS seems to be the easiest solution. Really.

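A less drastic option is to set the service's Number of tasks to 0 in the ECS console. With no tasks left to start, the service can often reach a steady state, which lets the stuck update or rollback finish: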

  • Open the ECS console.
  • Select the cluster and service.
  • Click Update and set the Number of tasks to 0.
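The same console steps can be scripted. This is a minimal sketch assuming `ecs` is a boto3 ECS client; the function name is hypothetical, and the call stops every running task of the service, so it is only appropriate when the service is already broken.

```python
def scale_service_to_zero(ecs, cluster, service):
    """Set the service's desired task count to 0 and return the new value."""
    response = ecs.update_service(cluster=cluster, service=service, desiredCount=0)
    return response["service"]["desiredCount"]
```

After the stack has settled, the desired count can be raised again with the same `update_service` call and a non-zero `desiredCount`.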

More details can be found in the AWS ECS documentation.

Written by Kacper Bąk

Software Engineer & Backend Developer