Granular Control Over Scale-Down For Mission-Critical ECS Services

Managing highly efficient ECS clusters requires balancing overprovisioning, which drives costs up, against underprovisioning, which results in performance issues and pending tasks.

Fortunately, Ocean by Spot provides ECS users with an automated, serverless experience. Its proprietary autoscaler leverages just the right blend of type, size, and lifecycle of container instances, for an optimally utilized cluster. 

However, a cluster may contain Tasks that should not be rescheduled even if the underlying node is underutilized and could be scaled down (calculation jobs, for example). To fine-tune the automated scale-down that Ocean performs on underutilized nodes, we are happy to introduce a new feature for Ocean’s ECS Autoscaler: Restrict Scale Down.

This feature provides optimized support for use cases with long loading times or run durations, which should not be interrupted by a reschedule action. Customers now have more control over the behavior of their cluster, without any additional configuration on the infrastructure side.

How does it work?

The reason Ocean can achieve extraordinarily high cluster efficiency is twofold. First, a proprietary “Tetris Scaling” scale-up process matches Task requirements to instance specifications and avoids overprovisioning. Second, its scale-down behavior:
Ocean monitors the cluster and runs bin-packing algorithms that simulate different permutations of Task placement across the available container instances. A container instance is considered for scale-down when all of its running Tasks are schedulable on other instances.

When an instance is chosen for scale-down, it is drained: its running Tasks are rescheduled on other instances, and the instance is then terminated.
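To make the scale-down condition concrete, here is a minimal sketch of the check "all running Tasks are schedulable on other instances". Ocean's actual bin-packing algorithm is proprietary and not described here, so this uses a simple first-fit-decreasing heuristic as a stand-in. The `Task` and `Instance` classes, the `can_scale_down` function, and the choice of CPU units and MiB as the only resources are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    cpu: int      # CPU units requested
    memory: int   # MiB requested


@dataclass
class Instance:
    name: str
    cpu: int      # total CPU units on the instance
    memory: int   # total MiB on the instance
    tasks: list = field(default_factory=list)

    def free(self):
        """Return (free_cpu, free_memory) after subtracting running tasks."""
        used_cpu = sum(t.cpu for t in self.tasks)
        used_mem = sum(t.memory for t in self.tasks)
        return self.cpu - used_cpu, self.memory - used_mem


def can_scale_down(candidate, others):
    """Return True if every task on `candidate` fits on some other instance.

    Illustrative first-fit-decreasing simulation, not Ocean's algorithm.
    """
    # Track the remaining capacity of the other instances during the simulation.
    free = {inst.name: list(inst.free()) for inst in others}
    # Try to place the largest tasks first.
    ordered = sorted(candidate.tasks, key=lambda t: (t.cpu, t.memory), reverse=True)
    for task in ordered:
        for capacity in free.values():
            if task.cpu <= capacity[0] and task.memory <= capacity[1]:
                capacity[0] -= task.cpu
                capacity[1] -= task.memory
                break
        else:
            return False  # no other instance can host this task
    return True
```

In this sketch, an instance whose tasks all find a home elsewhere would be a candidate for draining, while a single unplaceable task keeps the instance running.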

Now, customers can add the tag spotinst.io/restrict-scale-down to ECS Task Definitions or Services to prevent the scale-down of any Container Instance running those Tasks.

The following is an example of the tag that can be added to a Task Definition or a Service:

"tags": [
  {
    "key": "spotinst.io/restrict-scale-down",
    "value": "true"
  }
]

Using the above tag ensures that Container Instances running the matching Tasks are not considered by Ocean’s Autoscaler for scale-down. Note that instances may still be replaced if they are unhealthy, or in the case of a user-initiated deployment.
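The tag can also be applied programmatically. The sketch below assumes the AWS SDK for Python (boto3), whose ECS client exposes `tag_resource(resourceArn=..., tags=[...])`. The helper name `restrict_scale_down` and the ARN in the comment are hypothetical, and the sketch covers only attaching the tag to an existing resource.

```python
# Tag payload taken from the example above.
RESTRICT_SCALE_DOWN_TAG = {
    "key": "spotinst.io/restrict-scale-down",
    "value": "true",
}


def restrict_scale_down(ecs_client, resource_arn):
    """Attach the restrict-scale-down tag to an ECS Service or Task Definition.

    `ecs_client` is expected to behave like boto3.client("ecs").
    """
    ecs_client.tag_resource(
        resourceArn=resource_arn,
        tags=[RESTRICT_SCALE_DOWN_TAG],
    )


# Example usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   restrict_scale_down(
#       boto3.client("ecs"),
#       "arn:aws:ecs:us-east-1:123456789012:service/my-cluster/my-service",
#   )
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before pointing it at a real account.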

For more information about tagging in ECS, see Tagging Your Amazon EC2 Resources.