What Is AWS Fargate Autoscaling?
Fargate is an AWS serverless compute engine for containers that lets you run Docker containers without managing the underlying infrastructure. With AWS Fargate, you can focus on building and deploying your applications instead of provisioning and managing the servers that run them.
Automatic scaling is a feature that lets you automatically adjust the number of Fargate tasks running in response to changes in demand for your application. It helps you to optimize the use of your resources and ensure that your application is always available and responsive.
In this article:
- Why Is AWS Fargate Autoscaling Important?
- How Fargate Autoscaling Works
- Fargate Automatic Scaling Policies
- How to Configure Amazon ECS Service Autoscaling on Fargate
Why Is AWS Fargate Autoscaling Important?
AWS Fargate autoscaling is important because it enables users to automatically adjust the capacity of containerized applications in response to changing demand, without the need for manual intervention. This is important because it provides several key benefits:
- Automatically adjusting to demand: Fargate autoscaling allows users to easily scale their applications up or down to respond to changes in demand, without the need to manually adjust the capacity. This enables applications to handle spikes in traffic and maintain performance during periods of high demand, while also ensuring that resources are not wasted during periods of low demand.
- Cost savings: It can help reduce costs by automatically adjusting the number of running containers based on demand. This means that users only pay for the resources they need, and they can avoid over-provisioning resources that are not being used.
- Increased resilience: It can improve the resilience of applications by automatically adding or removing capacity in response to failures or outages. This helps to ensure that applications are always available and can recover quickly from disruptions.
- Simplified management: It simplifies the management of containerized applications by automating the process of scaling capacity up or down. This means that users can focus on developing and deploying their applications, rather than managing the underlying infrastructure.
How Fargate Autoscaling Works
AWS Application Auto Scaling is a service that automatically scales resources for your applications based on predefined scaling policies. It can be used with a wide range of AWS services, including Amazon ECS, to automate the scaling of resources for your containerized applications.
When using Amazon ECS with Fargate, you can configure Application Auto Scaling to automatically adjust the desired number of tasks for your Fargate service based on CloudWatch metrics. CloudWatch is a monitoring service that collects and tracks metrics, logs, and events from various AWS resources, including Fargate tasks.
To use CloudWatch with Amazon ECS, you first define a CloudWatch alarm that monitors a metric, such as CPU utilization or network traffic, for your Fargate service. You then configure autoscaling to use that alarm to trigger scaling actions.
When the metric breaches the threshold you defined, Application Auto Scaling automatically scales your Fargate service by increasing or decreasing the desired number of tasks. This helps you optimize the use of your resources and ensure that your application stays available and responsive.
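Before any policy can act on a Fargate service, the service must be registered with Application Auto Scaling as a scalable target. As a minimal sketch, the dictionary below shows the parameters that registration takes (the cluster and service names are hypothetical placeholders, and the actual API call is left as a comment since it requires AWS credentials):

```python
# Parameters Application Auto Scaling needs to manage the desired count
# of an ECS service running on Fargate.
scalable_target = {
    "ServiceNamespace": "ecs",
    # Format is "service/<cluster-name>/<service-name>"; names are hypothetical.
    "ResourceId": "service/demo-cluster/demo-service",
    "ScalableDimension": "ecs:service:DesiredCount",
    "MinCapacity": 2,   # never scale in below 2 tasks
    "MaxCapacity": 10,  # never scale out beyond 10 tasks
}

# With credentials configured, this would be passed to the API, e.g.:
# boto3.client("application-autoscaling").register_scalable_target(**scalable_target)
```

Scaling policies attached later can only move the desired count within the MinCapacity/MaxCapacity bounds set here.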
Fargate Automatic Scaling Policies
There are two main types of automatic scaling policies that can be used with Fargate: target-tracking policies and step-based policies.
Target-tracking autoscaling policies automatically adjust the number of resources, such as Fargate tasks, based on a target value for a specific metric. The objective is to keep the metric value as close as possible to the specified target.
When creating a target-tracking autoscaling policy, you start by specifying a metric to track and the target value for that metric. The policy then adds tasks when the metric rises above the target and removes them when it falls below; it never scales out in response to a metric that is below the target. The policy also won’t scale while the metric lacks sufficient data, to avoid making incorrect scaling decisions.
It’s important to note that there can be gaps between the target and actual metric values, because the number of tasks launched or terminated is rounded up or down. Additionally, the policy scales out quickly but scales in gradually, to avoid abrupt changes that could destabilize the system.
When using target-tracking autoscaling policies, it’s important not to delete or modify the CloudWatch alarms managed by Service Auto Scaling, as these are critical for the proper functioning of the policy. These alarms monitor the metrics the policy is based on and trigger its scaling actions.
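To make this concrete, here is a sketch of a target-tracking configuration as it would be passed to Application Auto Scaling, along with a small helper that illustrates the rough intuition behind target tracking (this is an approximation for illustration, not AWS’s exact scaling algorithm; all numeric values are illustrative):

```python
import math

# Target-tracking configuration for an ECS/Fargate service: keep average
# CPU utilization near 50%. Would be supplied to put_scaling_policy as
# TargetTrackingScalingPolicyConfiguration.
target_tracking_config = {
    "TargetValue": 50.0,  # aim for 50% average CPU
    "PredefinedMetricSpecification": {
        "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    },
    "ScaleOutCooldown": 60,   # seconds to wait after scaling out
    "ScaleInCooldown": 120,   # scale in more conservatively
}

def approximate_desired_tasks(current_tasks: int, metric_value: float,
                              target: float) -> int:
    """Rough intuition only: scale capacity in proportion to how far the
    metric is from the target, rounding up (not AWS's exact algorithm)."""
    return max(1, math.ceil(current_tasks * metric_value / target))

# e.g. 10 tasks running at 75% CPU with a 50% target suggests ~15 tasks.
```

Note how the rounding in the helper mirrors the gaps between target and actual values described above.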
A step scaling policy allows you to scale resources based on a set of predefined steps. With this type of policy, you specify a CloudWatch alarm that monitors a metric, such as CPU usage or network traffic, and then define a set of step adjustments that determine how many tasks to add or remove when the metric breaches a certain threshold.
To create a step scaling policy, you specify step adjustments that determine the scaling amount based on the type of scaling adjustment. For example, you can specify the metric values that define the lower and upper bounds for each adjustment, as well as the number of tasks to add or remove when the metric falls within those bounds.
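The step adjustments above can be sketched as follows. The bounds are offsets from the alarm threshold, as in Application Auto Scaling’s StepAdjustments, and the helper shows how a metric value selects a step (threshold and step values are illustrative, not recommendations):

```python
# Illustrative steps for an alarm that fires at 70% CPU:
# 70-85% CPU -> add 1 task; 85%+ CPU -> add 3 tasks.
ALARM_THRESHOLD = 70.0
step_adjustments = [
    {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 15.0,
     "ScalingAdjustment": 1},
    {"MetricIntervalLowerBound": 15.0,  # no upper bound: open-ended step
     "ScalingAdjustment": 3},
]

def tasks_to_add(metric_value: float) -> int:
    """Pick the step adjustment whose interval contains the metric value,
    measured as an offset from the alarm threshold."""
    offset = metric_value - ALARM_THRESHOLD
    for step in step_adjustments:
        lower = step.get("MetricIntervalLowerBound", float("-inf"))
        upper = step.get("MetricIntervalUpperBound", float("inf"))
        if lower <= offset < upper:
            return step["ScalingAdjustment"]
    return 0  # alarm not breached: no scaling action
```

For example, 75% CPU falls in the first step and adds one task, while 90% CPU falls in the open-ended second step and adds three.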
While step autoscaling policies can be effective for certain use cases, they require more manual configuration and tuning compared to target-tracking autoscaling policies. Target-tracking policies, as mentioned earlier, are easier to set up and manage because they automatically adjust the number of resources based on the specified metric’s target value.
In contrast, step autoscaling policies require you to define a set of steps, which can be time-consuming and error-prone. Additionally, step adjustments may not be as precise as target-tracking policies because they only allow you to scale based on predefined steps rather than a specific target value. This can lead to suboptimal resource allocation and increased costs.
How to Configure Amazon ECS Service Autoscaling on Fargate
To enable Service Auto Scaling while creating or updating a service in the Amazon ECS console, follow these steps on the Set Autoscaling page:
- Select Configure Service Auto Scaling to adjust your service’s desired count.
- Under Minimum number of tasks, input the lowest number of tasks you want Service Auto Scaling to maintain.
- For Desired number of tasks, specify the number of tasks you want Service Auto Scaling to start from.
- Set the Maximum number of tasks to the highest number of tasks allowed for Service Auto Scaling.
- For the IAM role for Service Auto Scaling, choose ecsAutoscaleRole.
- In the Automatic task scaling policies section, select Auto Scaling Policy.
- Complete the remaining steps in the setup wizard to create or update your service.
Optimizing AWS Fargate Costs with Spot by NetApp
Automatically give containers the optimal infrastructure and make sure your clusters always have the resources they need.
Ocean by Spot continuously monitors and optimizes container infrastructure to maximize efficiency and availability while minimizing costs, helping CloudOps teams focus on their workloads and applications rather than be burdened by management of their container infrastructure.
Learn more about how Spot by NetApp can help you optimize AWS Fargate costs