An EC2 auto scaling group is a logical collection of several Amazon EC2 instances used for management and scaling purposes. Auto scaling groups let you use core features of the EC2 Auto Scaling service, including health checks, minimum/maximum instances and scaling policies.
EC2 Auto Scaling mainly does two things: it maintains a specified number of instances in a group, or it automatically increases and decreases the size of the group in response to factors such as application load or a predetermined schedule.
EC2 Auto Scaling is offered as part of the general AWS Auto Scaling service, which can help you scale multiple Amazon services, including ECS and RDS.
The size of your auto scaling group is maintained according to a pre-defined number of instances, which you configure as the desired capacity. You can resize the group manually or automatically according to application requirements.
Initially, an auto scaling group launches enough instances to reach the desired capacity. By default, it maintains this number of instances by performing regular health checks, identifying unhealthy instances, terminating them and launching replacement instances.
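The maintenance behavior described above can be sketched as a small reconciliation loop. This is a toy model in plain Python, not how the service is implemented: the `Instance` and `Group` classes, `launch_instance` and `reconcile` are hypothetical names, and `desired_capacity` stands for the capacity you configure on the group.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # hypothetical ID generator for simulated instances


@dataclass
class Instance:
    instance_id: str
    healthy: bool = True


@dataclass
class Group:
    desired_capacity: int
    instances: list = field(default_factory=list)


def launch_instance() -> Instance:
    """Simulate launching a new, healthy instance."""
    return Instance(instance_id=f"i-{next(_ids):04d}")


def reconcile(group: Group) -> None:
    """One pass: drop unhealthy instances, then launch until capacity is met."""
    group.instances = [i for i in group.instances if i.healthy]
    while len(group.instances) < group.desired_capacity:
        group.instances.append(launch_instance())


group = Group(desired_capacity=3)
reconcile(group)                    # initial launch up to the configured capacity
group.instances[0].healthy = False  # simulate a failed health check
reconcile(group)                    # the unhealthy instance is replaced
print([i.instance_id for i in group.instances])
```

The point of the sketch is only that the group converges back to its configured size after each health check pass.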
Beyond this basic functionality, you can use a scaling policy to dynamically change group size, using policies for the following jobs:
Auto scaling groups use launch templates to define which new instances will be launched. You can define a specific instance type, or several types of instances, in a launch template. You can also set different launch templates for different EC2 resources.
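As an illustration, the request for a group that uses a launch template might be assembled as below. This is a minimal sketch: the helper `build_group_request`, the group name, template name and subnet IDs are all hypothetical, while the dictionary keys follow the parameters of the boto3 `create_auto_scaling_group` call. Nothing is sent to AWS here.

```python
def build_group_request(name, template_name, subnet_ids, min_size, max_size, desired):
    """Build keyword arguments for create_auto_scaling_group (hypothetical helper)."""
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must lie between min and max size")
    return {
        "AutoScalingGroupName": name,
        # The launch template defines which instances the group launches.
        "LaunchTemplate": {"LaunchTemplateName": template_name, "Version": "$Latest"},
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Comma-separated subnet IDs; each subnet belongs to one AZ.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


request = build_group_request(
    name="web-asg",                    # placeholder name
    template_name="web-template",      # placeholder launch template
    subnet_ids=["subnet-aaa", "subnet-bbb"],  # placeholder subnets
    min_size=1,
    max_size=4,
    desired=2,
)
print(request)

# To send the request (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("autoscaling").create_auto_scaling_group(**request)
```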
You can span auto scaling groups across multiple Availability Zones (AZs) in a Region. You can then attach a load balancer, which distributes incoming traffic across all of your chosen AZs.
If an AZ becomes unavailable or unhealthy, Auto Scaling launches new instances in an unaffected AZ. When launching instances, it first tries the AZ with the fewest instances. If that attempt fails, Auto Scaling tries to launch the instances in other AZs and continues until it succeeds.
When you need to expand the availability of your application, you can add an AZ to your auto scaling group. Remember to enable the AZ for the relevant load balancer. After the new AZ is enabled, the load balancer starts routing traffic evenly across all enabled AZs.
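The placement rule described above (prefer the AZ with the fewest instances, then fall back to other AZs on failure) can be sketched as follows. This is a simplified simulation, not the service's actual algorithm; `launch_with_fallback` and the `try_launch` callback are hypothetical names.

```python
def launch_with_fallback(counts, try_launch):
    """Try AZs from fewest to most instances; return the AZ that accepted the launch.

    counts: dict mapping AZ name -> current instance count (updated in place).
    try_launch: callable taking an AZ name and returning True on success.
    Returns None if every AZ rejects the launch.
    """
    # Sort by instance count, breaking ties by AZ name for determinism.
    for az in sorted(counts, key=lambda z: (counts[z], z)):
        if try_launch(az):
            counts[az] += 1
            return az
    return None


counts = {"us-east-1a": 2, "us-east-1b": 1, "us-east-1c": 2}
unavailable = {"us-east-1b"}  # simulate an AZ that cannot take new instances

chosen = launch_with_fallback(counts, lambda az: az not in unavailable)
print(chosen, counts)
```

In this run the emptiest AZ is unavailable, so the launch falls through to the next candidate.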
Note that while auto scaling groups can contain instances from multiple AZs, all AZs must be located in the same Region. This functionality does not support the use of multiple Regions.
Here are other limitations you might want to consider when choosing AZs:
Auto scaling groups let you launch a fleet composed of on-demand EC2 instances and spot instances. There are multiple ways to get discounted rates for instances while auto scaling:
Mixing different types of instances also improves availability:
There are several allocation strategies used by auto scaling groups to add specific types of instances to a group:
Even though spot instances can be mixed with on-demand instances in an auto scaling group, the spot instances can still be interrupted. When a spot instance is at elevated risk of interruption, EC2 sends an instance rebalance recommendation.
Amazon explicitly warns that it will not always send this recommendation in advance, and it may arrive together with the spot instance interruption notice, which gives you two minutes to move workloads off the instance before it is interrupted.
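On the instance itself, both signals are exposed through the instance metadata service. The sketch below shows one way to interpret them, under stated assumptions: `fetch` is a hypothetical callable that returns the response body for a metadata path, or `None` when the path returns 404. A real implementation would query `http://169.254.169.254` and obtain an IMDSv2 session token first. The two paths and the JSON field names (`action`, `time`, `noticeTime`) follow the EC2 documentation.

```python
import json
from datetime import datetime, timezone

REBALANCE_PATH = "/latest/meta-data/events/recommendations/rebalance"
INTERRUPTION_PATH = "/latest/meta-data/spot/instance-action"


def _parse_time(value):
    """Parse a UTC timestamp such as 2020-10-27T08:22:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def check_spot_signals(fetch):
    """Return a dict describing any pending rebalance or interruption signal."""
    signals = {"rebalance_at": None, "interruption_at": None, "action": None}

    body = fetch(REBALANCE_PATH)
    if body:
        signals["rebalance_at"] = _parse_time(json.loads(body)["noticeTime"])

    body = fetch(INTERRUPTION_PATH)
    if body:
        doc = json.loads(body)
        signals["action"] = doc["action"]
        signals["interruption_at"] = _parse_time(doc["time"])

    return signals


# Simulated metadata responses (sample values, not real data).
fake_metadata = {
    REBALANCE_PATH: '{"noticeTime": "2020-10-27T08:22:00Z"}',
    INTERRUPTION_PATH: '{"action": "terminate", "time": "2020-10-27T08:24:00Z"}',
}
signals = check_spot_signals(fake_metadata.get)
print(signals)
```

Polling both paths matters because, as noted above, the rebalance recommendation may arrive at the same time as the interruption notice.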
Once you receive a rebalance recommendation or an interruption notice, you can choose to:
You can enable an option called Capacity Rebalancing for an EC2 auto scaling group. This means that Auto Scaling will automatically attempt to replace a spot instance that received a rebalance recommendation, with another instance that has not received this warning. Alternatively, you can create a lifecycle hook to perform any other action when the warning is received.
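A request that enables Capacity Rebalancing on a group mixing on-demand and spot instances might look like the sketch below. The helper `build_spot_group_update`, the group and template names, and the instance types are hypothetical, and the `capacity-optimized` strategy is just one possible choice. The dictionary keys follow the parameters of the boto3 `update_auto_scaling_group` call, and nothing is sent to AWS here.

```python
def build_spot_group_update(name, template_name, instance_types,
                            on_demand_base, on_demand_pct):
    """Build keyword arguments for update_auto_scaling_group (hypothetical helper)."""
    if not 0 <= on_demand_pct <= 100:
        raise ValueError("on-demand percentage must be between 0 and 100")
    return {
        "AutoScalingGroupName": name,
        # Replace spot instances that receive a rebalance recommendation.
        "CapacityRebalance": True,
        "MixedInstancesPolicy": {
            "LaunchTemplate": {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateName": template_name,
                    "Version": "$Latest",
                },
                # Several instance types give the group more capacity pools to draw on.
                "Overrides": [{"InstanceType": t} for t in instance_types],
            },
            "InstancesDistribution": {
                "OnDemandBaseCapacity": on_demand_base,
                "OnDemandPercentageAboveBaseCapacity": on_demand_pct,
                "SpotAllocationStrategy": "capacity-optimized",
            },
        },
    }


update = build_spot_group_update(
    name="web-asg",                # placeholder name
    template_name="web-template",  # placeholder launch template
    instance_types=["m5.large", "m5a.large", "m4.large"],
    on_demand_base=1,
    on_demand_pct=25,
)
print(update)

# To apply it (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("autoscaling").update_auto_scaling_group(**update)
```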
You can use tags to classify auto scaling groups—for example to indicate their purpose, the environment they run in, or their owner. You can add several tags to a single group, and specify that tags should also be applied to the individual EC2 instances inside the group. This can help break down instance costs in your EC2 bill.
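For example, tags that should also reach the instances can be expressed with the `PropagateAtLaunch` flag. This is a minimal sketch: the helper `build_group_tags` and the tag values are hypothetical, while the entry fields follow the boto3 `create_or_update_tags` call for auto scaling groups.

```python
def build_group_tags(group_name, tags, propagate=True):
    """Turn a plain dict of tags into tag entries for an auto scaling group."""
    return [
        {
            "ResourceId": group_name,
            "ResourceType": "auto-scaling-group",
            "Key": key,
            "Value": value,
            # When True, the tag is also applied to instances the group launches.
            "PropagateAtLaunch": propagate,
        }
        for key, value in tags.items()
    ]


tag_entries = build_group_tags(
    "web-asg",  # placeholder group name
    {"environment": "production", "owner": "platform-team"},
)
print(tag_entries)

# To apply them (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("autoscaling").create_or_update_tags(Tags=tag_entries)
```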
Amazon Elastic Load Balancing (ELB) automatically distributes incoming traffic across EC2 instances, so that no single instance is overloaded. You can attach a load balancer to an auto scaling group, which lets traffic be balanced across all members of the group.
When a load balancer is attached to an auto scaling group, all incoming requests go to the load balancer, which routes the traffic to one of the instances in the group. Instances that the group launches are registered with the load balancer automatically, and instances that the group terminates are deregistered automatically, so you do not need to register or deregister them yourself.
Once you've attached a load balancer to an auto scaling group, you can configure the group to scale based on ELB metrics, such as the number of requests per target. You can also have the group use the load balancer's health checks, so that instances the load balancer reports as unhealthy are replaced.
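Scaling on requests per target can be expressed as a target tracking policy. The sketch below is illustrative only: the helper `build_request_count_policy`, the group name and the resource label value are hypothetical placeholders, while the keys follow the parameters of the boto3 `put_scaling_policy` call. Nothing is sent to AWS here.

```python
def build_request_count_policy(group_name, resource_label, target_requests):
    """Build keyword arguments for put_scaling_policy (hypothetical helper)."""
    if target_requests <= 0:
        raise ValueError("target request count must be positive")
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-requests-per-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ALBRequestCountPerTarget",
                # Identifies the load balancer and target group being tracked.
                "ResourceLabel": resource_label,
            },
            # The group scales to keep the metric near this value.
            "TargetValue": float(target_requests),
        },
    }


policy_request = build_request_count_policy(
    group_name="web-asg",  # placeholder name
    # Placeholder label; the real value comes from your load balancer and target group ARNs.
    resource_label="app/my-alb/0123456789abcdef/targetgroup/my-targets/fedcba9876543210",
    target_requests=1000,
)
print(policy_request)

# To apply it (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("autoscaling").put_scaling_policy(**policy_request)
```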
You can use the following types of load balancers with EC2 auto scaling groups:
Your load balancer and auto scaling group need to be in the same Region. Additionally, the load balancer's target group must use the instance target type, not the IP target type, in order to work with an auto scaling group.
Elastigroup provides AI-driven prediction of spot instance interruptions, and automated workload rebalancing with an optimal blend of spot, reserved and on-demand instances. It lets you leverage spot instances to reduce costs in AWS, even for production and mission-critical workloads, with low management overhead.
Key features of Elastigroup include: