AWS Auto Scaling is an Amazon service that lets you configure automatic scaling of AWS resources. It increases the computing power or storage available to applications when load increases, and reduces it when the extra capacity is no longer needed.
The AWS Auto Scaling Console provides a single user interface to use the auto scaling capabilities of various AWS services. AWS Auto Scaling can be used to scale Amazon Elastic Compute Cloud (EC2), EC2 Spot Fleet requests, Elastic Container Service (ECS), DynamoDB, and Amazon Aurora.
AWS Auto Scaling lets you configure and manage scalability using scaling strategies, which define how to optimize resource usage: for availability, for cost, or for a balance of the two. You can also create custom scaling strategies.
You can also use scaling plans, which are policies that adjust resources using dynamic or predictive scaling.
Let’s briefly review how AWS Auto Scaling can help you manage scalability for common AWS services.
Amazon EC2 Auto Scaling helps you maintain the number of EC2 instances your application needs to handle incoming traffic.
You can create EC2 Auto Scaling groups, which are collections of EC2 instances. Set a minimum size so that the group never shrinks below it (if an instance fails, it is replaced). Set a maximum size and the group will never grow beyond it.
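The effect of the minimum and maximum sizes can be sketched in a few lines of Python. This is an illustrative model only, not the AWS API: the `clamp_capacity` helper and the example numbers are hypothetical.

```python
def clamp_capacity(desired: int, min_size: int, max_size: int) -> int:
    """Return the capacity a group would run with, given its size bounds.

    Hypothetical helper: it only models the rule that the group never
    shrinks below min_size and never grows beyond max_size.
    """
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(min_size, min(desired, max_size))


if __name__ == "__main__":
    # A request for 0 instances is raised to the minimum,
    # a request for 25 is capped at the maximum.
    for desired in (0, 4, 25):
        print(desired, "->", clamp_capacity(desired, min_size=2, max_size=10))
```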
A spot instance is an Amazon EC2 instance provided at a discount of up to 90%, because Amazon currently has spare capacity of this instance type in a specific availability zone. Spot instances can be interrupted with two minutes’ notice.
A Spot Fleet is a grouping of EC2 spot instances, based on custom criteria. Spot Fleets are created by spot fleet requests, which specify how much capacity is needed, how much of it should be made up of on-demand instances, which types of spot instances are required, and a maximum price.
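The ingredients of a spot fleet request described above can be modeled as a small data structure. The class and field names below are assumptions chosen for illustration and do not reproduce the real EC2 request format; the instance type and price are made-up example values.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpotFleetRequestSketch:
    """Hypothetical model of what a spot fleet request specifies."""
    target_capacity: int                 # total capacity needed
    on_demand_capacity: int              # portion served by on-demand instances
    instance_types: List[str] = field(default_factory=list)
    max_price_per_hour: Optional[float] = None  # None means no explicit cap

    def spot_capacity(self) -> int:
        """Capacity left to be filled by spot instances."""
        if not 0 <= self.on_demand_capacity <= self.target_capacity:
            raise ValueError("on-demand capacity must be within target capacity")
        return self.target_capacity - self.on_demand_capacity


if __name__ == "__main__":
    request = SpotFleetRequestSketch(
        target_capacity=10,
        on_demand_capacity=3,
        instance_types=["m5.large", "c5.large"],  # example values
        max_price_per_hour=0.05,                  # example value
    )
    print("spot capacity:", request.spot_capacity())
```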
There are two types of spot fleet requests:
AWS Auto Scaling can automatically adjust the capacity of a Spot Fleet, based on demand. It supports the following scaling policies:
ECS auto scaling can be triggered by the CloudWatch metrics available for ECS, such as CPU and memory usage. AWS Auto Scaling automatically increases or decreases the number of ECS tasks: when a large volume of requests arrives, the metrics trigger the addition of tasks, and tasks are removed again when load decreases.
ECS auto scaling can also use scaling plans like step scaling and scheduled scaling (see scaling plans).
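Step scaling adds more capacity the further a metric exceeds its threshold. The sketch below models that idea for ECS tasks; the 70% threshold, the step boundaries and the `step_adjustment` function are all hypothetical values picked for illustration, not ECS defaults.

```python
from typing import List, Optional, Tuple

# (lower bound of breach, upper bound of breach or None, tasks to add)
# Bounds are percentage points above the threshold. Hypothetical values.
Step = Tuple[float, Optional[float], int]
STEPS: List[Step] = [
    (0.0, 10.0, 1),
    (10.0, 20.0, 2),
    (20.0, None, 4),
]


def step_adjustment(cpu_percent: float, threshold: float = 70.0,
                    steps: List[Step] = STEPS) -> int:
    """Return how many tasks to add for the observed CPU utilization."""
    breach = cpu_percent - threshold
    if breach < 0:
        return 0  # metric is below the alarm threshold: no scale-out
    for lower, upper, change in steps:
        if breach >= lower and (upper is None or breach < upper):
            return change
    return 0


if __name__ == "__main__":
    for cpu in (65.0, 75.0, 85.0, 95.0):
        print(f"CPU {cpu}% -> add {step_adjustment(cpu)} task(s)")
```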
RDS auto scaling provides automated storage scaling for MySQL, PostgreSQL, MariaDB, SQL Server, and Oracle databases. RDS monitors database storage utilization, and when current usage is close to the provisioned size, it scales up storage capacity available to the database instance.
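The decision RDS makes can be approximated as follows. This is a simplified sketch: the 10% free-space threshold, the 10% growth step and the function name are assumptions for illustration, and the actual service applies additional conditions, so consult the RDS documentation for the exact rules.

```python
def next_allocated_storage(allocated_gib: int, used_gib: int, max_gib: int,
                           free_threshold: float = 0.10,
                           growth: float = 0.10) -> int:
    """Return the storage size after one autoscaling check.

    Assumed rule: if free space falls below free_threshold of the
    allocated size, grow by `growth` of the allocated size, but never
    beyond max_gib (the configured upper limit).
    """
    free_fraction = (allocated_gib - used_gib) / allocated_gib
    if free_fraction >= free_threshold:
        return allocated_gib  # enough headroom, nothing to do
    grown = allocated_gib + max(1, round(allocated_gib * growth))
    return min(grown, max_gib)


if __name__ == "__main__":
    print(next_allocated_storage(100, 50, 1000))  # plenty of free space
    print(next_allocated_storage(100, 95, 1000))  # nearly full: grows
    print(next_allocated_storage(100, 95, 105))   # growth capped by the limit
```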
Scaling events are performed with no downtime, without affecting current database operations or interfering with current transactions.
In DynamoDB database workloads, it is challenging to estimate required read and write capacity. Applications may require a high throughput for only a short time. DynamoDB Auto Scaling dynamically adjusts capacity based on actual inbound traffic patterns.
When workload throughput decreases, Auto Scaling automatically decreases the number of capacity units, avoiding payment for unneeded capacity.
DynamoDB Auto Scaling works by creating scaling policies for a table or secondary index. In the scaling policy, you specify whether to scale read capacity, write capacity, or both, along with the minimum and maximum provisioned capacity units for the table or index.
Scaling plans are a key component of AWS Auto Scaling. A scaling plan provides a set of instructions for scaling resources up and down. If you use AWS CloudFormation or add tags to your AWS resources, you can set up a different scaling plan for each group of resources.
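A DynamoDB scaling policy also carries a target utilization, the share of provisioned capacity you want actual consumption to occupy. The sketch below shows how such a policy could translate consumed capacity into a new provisioned value within the configured bounds. The function name and the numbers are hypothetical, and the real service reacts to sustained CloudWatch alarms rather than a single reading.

```python
import math


def target_provisioned_units(consumed_units: int, target_percent: int,
                             min_units: int, max_units: int) -> int:
    """Return provisioned capacity units for the observed consumption.

    Chooses the smallest capacity at which consumption is at or below
    target_percent of provisioned capacity, then clamps it to the
    policy's minimum and maximum.
    """
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in (0, 100]")
    wanted = math.ceil(consumed_units * 100 / target_percent)
    return max(min_units, min(wanted, max_units))


if __name__ == "__main__":
    # 350 consumed units at a 70% target need 500 provisioned units.
    print(target_provisioned_units(350, 70, min_units=5, max_units=1000))
    # Very low traffic falls back to the minimum.
    print(target_provisioned_units(1, 70, min_units=5, max_units=1000))
```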
AWS Auto Scaling analyzes the behavior of each resource and provides recommendations for customized scaling strategies. After a scaling plan is created, Auto Scaling executes it by combining dynamic scaling and predictive scaling methods:
You can configure Auto Scaling to maintain a specified number of instances indefinitely. Amazon EC2 Auto Scaling periodically checks the health of the instances in the group. When it finds an unhealthy instance, it terminates that instance and launches a replacement. This ensures the required number of instances is always running.
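The health-check cycle can be pictured as a reconciliation loop: drop unhealthy instances, then launch replacements until the desired count is reached. The sketch below is a toy model under that assumption; the `reconcile` function and the instance IDs are hypothetical.

```python
from typing import Dict, Iterator


def reconcile(instances: Dict[str, bool], desired: int,
              new_ids: Iterator[str]) -> Dict[str, bool]:
    """Return the fleet after one health-check pass.

    `instances` maps an instance ID to True (healthy) or False (unhealthy).
    Unhealthy instances are terminated and replacements are launched
    until the fleet is back at `desired` instances.
    """
    fleet = {iid: True for iid, healthy in instances.items() if healthy}
    while len(fleet) < desired:
        fleet[next(new_ids)] = True  # launch a replacement
    return fleet


if __name__ == "__main__":
    current = {"i-a": True, "i-b": False, "i-c": True}
    replacements = iter(["i-new1", "i-new2"])
    print(sorted(reconcile(current, desired=3, new_ids=replacements)))
```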
You can schedule scaling to occur automatically on specific dates and times. This feature is especially useful in situations where you can accurately forecast demand. Instead of relying on predictive scaling, you manually determine how much capacity to allocate at a given time. This is useful when there are unusual, known spikes in demand, for example before a holiday sale.
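A schedule of this kind boils down to a lookup from time windows to capacity. The following sketch assumes a hypothetical list of `(start, end, capacity)` windows and a default capacity outside them; the dates and sizes are made-up examples.

```python
from datetime import datetime
from typing import List, Tuple

# Each entry: (window start, window end, capacity during the window)
Schedule = List[Tuple[datetime, datetime, int]]


def scheduled_capacity(now: datetime, schedule: Schedule, default: int) -> int:
    """Return the capacity planned for `now`, or `default` outside any window."""
    for start, end, capacity in schedule:
        if start <= now < end:
            return capacity
    return default


if __name__ == "__main__":
    # Hypothetical holiday sale: run 20 instances for three days.
    sale = [(datetime(2024, 11, 28), datetime(2024, 12, 1), 20)]
    print(scheduled_capacity(datetime(2024, 11, 29, 9, 0), sale, default=4))
    print(scheduled_capacity(datetime(2024, 12, 5, 9, 0), sale, default=4))
```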
AWS Auto Scaling can scale resources according to actual application load. Select a load metric that is representative of how your resources respond to load; CPU or memory utilization is typically a good choice. When load shifts, Auto Scaling increases or decreases resources to keep the load metric at the target level.
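Keeping a metric at a target level can be approximated with a proportional rule: if the metric is above target, grow capacity by the same ratio, and shrink it when the metric is below target. This is a simplified sketch of that idea, not the exact algorithm AWS uses; the function name and values are hypothetical.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Return the capacity that would bring `metric` back toward `target`.

    Assumes the metric (for example average CPU %) scales inversely
    with the number of instances sharing the load.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    wanted = math.ceil(current * metric / target)
    return max(min_size, min(wanted, max_size))


if __name__ == "__main__":
    # 4 instances at 80% CPU with a 50% target: scale out.
    print(desired_capacity(4, metric=80, target=50, min_size=1, max_size=20))
    # 4 instances at 25% CPU with a 50% target: scale in.
    print(desired_capacity(4, metric=25, target=50, min_size=1, max_size=20))
```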
Elastigroup provides AI-driven prediction of spot instance interruptions, and automated workload rebalancing with an optimal blend of spot, reserved and on-demand instances. It lets you leverage spot instances to reduce costs in AWS, even for production and mission-critical workloads, with low management overhead.
Key features of Elastigroup include: