Traditional IT environments are limited: a fixed number of servers handles the load for any given application. When the number of requests increases, so does the load on each server. Eventually, the load exceeds what the servers can handle, causing degraded performance and failure. Amazon Elastic Compute Cloud (EC2) provides an Auto Scaling service that overcomes this challenge.
Auto Scaling makes sure there are enough EC2 instances to run your applications. Before the service can run, you define auto scaling groups. For each group, you specify a minimum and maximum number of EC2 instances. Auto Scaling then detects errors or failures on an instance and automatically launches a replacement instance to maintain the required capacity.
Amazon EC2 also offers dynamic auto scaling policies, based on load metrics, CloudWatch alarms, events from other Amazon services such as SQS, or a fixed schedule.
EC2 Auto Scaling is part of the AWS Auto Scaling service, which provides automatic scalability for several Amazon services.
There are three key components involved in EC2 Auto Scaling:
Groups organize EC2 instances into logical units for scaling and management purposes. When creating an auto scaling group, you can specify the minimum, maximum, and desired number of EC2 instances you need.
Launch templates are the newer way to configure instances for auto scaling. They replace launch configurations, which are still supported as a legacy option.
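To make the relationship between the three numbers concrete, here is a minimal sketch in plain Python. The `GroupSize` type and the helper functions are hypothetical names invented for this example. The dictionary keys are meant to mirror the parameters of boto3's `create_auto_scaling_group` call, but the sketch does not call AWS, and the group name, template name, and subnet IDs are placeholders.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class GroupSize:
    """Capacity limits for one auto scaling group (hypothetical helper type)."""
    min_size: int
    max_size: int
    desired_capacity: int

    def __post_init__(self) -> None:
        # The desired number of instances must sit between the two limits.
        if self.min_size < 0:
            raise ValueError("min_size must be non-negative")
        if self.min_size > self.max_size:
            raise ValueError("min_size must not exceed max_size")
        if not self.min_size <= self.desired_capacity <= self.max_size:
            raise ValueError("desired_capacity must lie between min_size and max_size")


def clamp_capacity(size: GroupSize, requested: int) -> int:
    """Bound a capacity requested by a scaling policy to the group's limits."""
    return max(size.min_size, min(size.max_size, requested))


def to_request(name: str, template_name: str, subnet_ids: List[str],
               size: GroupSize) -> Dict[str, Any]:
    """Build keyword arguments shaped like a create_auto_scaling_group request."""
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateName": template_name, "Version": "$Latest"},
        "MinSize": size.min_size,
        "MaxSize": size.max_size,
        "DesiredCapacity": size.desired_capacity,
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }
```

With `GroupSize(min_size=2, max_size=10, desired_capacity=4)`, a policy that asks for 25 instances is held at 10, and one that asks for 0 is held at 2.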
Launch templates specify configuration information for new instances created in an auto scaling group. This includes the Amazon Machine Image (AMI) to use when creating the instance, security groups, and key pair.
You can use versioning to define a subset of the full parameter set and reuse it to create additional versions of a launch template. You can, for example, create a default template that specifies common configuration values, and programmatically insert different values to create new versions of the template.
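The versioning idea can be sketched with an in-memory stand-in. The `LaunchTemplate` class below is hypothetical and only illustrates how a new version inherits the values of a source version and overrides a few of them. It is not the EC2 API, and the AMI, key pair, and security group values are placeholders.

```python
from copy import deepcopy
from typing import Any, Dict, List


class LaunchTemplate:
    """In-memory stand-in for a versioned launch template (illustration only)."""

    def __init__(self, name: str, data: Dict[str, Any]) -> None:
        self.name = name
        # Version numbers start at 1; index 0 holds version 1.
        self._versions: List[Dict[str, Any]] = [deepcopy(data)]

    def get(self, version: int) -> Dict[str, Any]:
        """Return a copy of the configuration stored under a version number."""
        if not 1 <= version <= len(self._versions):
            raise KeyError(f"{self.name} has no version {version}")
        return deepcopy(self._versions[version - 1])

    def create_version(self, source_version: int, overrides: Dict[str, Any]) -> int:
        """Copy a source version, apply overrides, and return the new version number."""
        data = self.get(source_version)
        data.update(overrides)
        self._versions.append(data)
        return len(self._versions)


# Default template with the common values, then a variant with a larger instance type.
template = LaunchTemplate("web", {
    "ImageId": "ami-00000000",            # placeholder AMI ID
    "InstanceType": "t3.micro",
    "KeyName": "example-key",             # placeholder key pair name
    "SecurityGroupIds": ["sg-00000000"],  # placeholder security group
})
large_version = template.create_version(1, {"InstanceType": "t3.large"})
```

Version 1 is left unchanged, so groups that still reference it keep launching the original configuration.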
EC2 Auto Scaling provides several ways to scale an instance group:
An EC2 instance in an auto scaling group has a different lifecycle than other EC2 instances. The lifecycle begins when the auto scaling group launches the instance, or when an instance is manually added to the group. The lifecycle ends when you terminate the instance, or when the group removes the instance from service and terminates it.
Several events, known as “scale out events”, initiate a process that tells the auto scaling group to launch new compute instances and add them to the group:
When one of these events happens, the auto scaling group creates new instances, using the group’s launch template or launch configuration. New instances start in the Pending state, and you can add lifecycle hooks to automatically perform an action when they are created.
After an instance is created and any lifecycle hooks are executed, it enters the InService state. It remains in this state until one of the events below occurs:
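The instance lifecycle can be read as a small state machine. The sketch below is a simplified model that covers only the launch and termination paths, with the wait states that lifecycle hooks add. It leaves out other states such as standby and detaching, and it ignores the case where a launch hook abandons the instance. The function names are hypothetical.

```python
from typing import Dict, List, Set

# Allowed transitions between lifecycle states (simplified subset).
TRANSITIONS: Dict[str, Set[str]] = {
    "Pending": {"Pending:Wait", "InService"},
    "Pending:Wait": {"Pending:Proceed"},      # a launch lifecycle hook is running
    "Pending:Proceed": {"InService"},
    "InService": {"Terminating"},
    "Terminating": {"Terminating:Wait", "Terminated"},
    "Terminating:Wait": {"Terminating:Proceed"},  # a termination hook is running
    "Terminating:Proceed": {"Terminated"},
    "Terminated": set(),
}


def is_valid_path(path: List[str]) -> bool:
    """Check that every consecutive pair of states is an allowed transition."""
    if not path or any(state not in TRANSITIONS for state in path):
        return False
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(path, path[1:]))


def path_to_in_service(has_launch_hook: bool) -> List[str]:
    """States a new instance passes through before it starts serving traffic."""
    if has_launch_hook:
        return ["Pending", "Pending:Wait", "Pending:Proceed", "InService"]
    return ["Pending", "InService"]
```

A launch hook therefore adds two intermediate states between Pending and InService, which is where bootstrap actions such as installing software would run.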
The following “scale in” events cause an auto scaling group to remove an instance from the group and destroy it:
Be sure to define a scale-in event for every scale-out event, to prevent unchecked scaling and instance sprawl.
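One way to picture a matched pair of scale-out and scale-in rules is a single policy object that carries both thresholds. This is a sketch under simple assumptions: a single metric such as average CPU percent, a fixed step of one instance, and hypothetical names (`ThresholdPolicy`, `next_capacity`). It is not how the AWS service evaluates policies internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdPolicy:
    """A scale-out threshold paired with its scale-in counterpart (hypothetical)."""
    scale_out_above: float  # add capacity when the metric rises above this value
    scale_in_below: float   # remove capacity when the metric falls below this value
    step: int = 1           # instances added or removed per decision

    def __post_init__(self) -> None:
        # A gap between the thresholds avoids flapping between the two actions.
        if self.scale_in_below >= self.scale_out_above:
            raise ValueError("scale_in_below must be lower than scale_out_above")
        if self.step < 1:
            raise ValueError("step must be at least 1")


def next_capacity(policy: ThresholdPolicy, metric: float, current: int,
                  min_size: int, max_size: int) -> int:
    """Return the capacity after one evaluation, bounded by the group limits."""
    if metric > policy.scale_out_above:
        target = current + policy.step
    elif metric < policy.scale_in_below:
        target = current - policy.step
    else:
        target = current
    return max(min_size, min(max_size, target))
```

Because the scale-in rule lives in the same object as the scale-out rule, a policy that can only grow the group cannot be constructed by accident.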
Here are several best practices that can help you manage EC2 scaling more effectively.
Base Amazon EC2 Auto Scaling policies on load metrics that are reported at one-minute intervals. This enables a faster response to changes in application usage. A scaling metric reported every five minutes slows response time, and can result in scaling events based on old data.
By default, EC2 provides basic monitoring, which tracks metrics every 5 minutes. For Auto Scaling based on EC2 metrics, it is recommended to enable detailed monitoring, which updates metrics every minute. Note this incurs an additional charge.
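The effect of metric frequency can be shown with a small calculation. The sketch below assumes per-minute CPU samples, averages them over the reporting period, and returns the minute at which the first datapoint above the threshold becomes available. It ignores metric ingestion delay and alarm evaluation settings, and the function name and sample values are made up for the example.

```python
from typing import List, Optional


def first_breach_minute(samples: List[float], period_minutes: int,
                        threshold: float) -> Optional[int]:
    """Minute at which the first datapoint above the threshold is available.

    samples[i] is the metric value for minute i. A datapoint for a period
    only exists once the whole period has elapsed.
    """
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    # Walk over complete, non-overlapping periods only.
    for start in range(0, len(samples) - period_minutes + 1, period_minutes):
        window = samples[start:start + period_minutes]
        if sum(window) / period_minutes > threshold:
            return start + period_minutes
    return None


# CPU jumps from 40% to 90% at minute 3.
cpu = [40, 40, 40, 90, 90, 90, 90, 90, 90, 90]
detailed = first_breach_minute(cpu, period_minutes=1, threshold=70)
basic = first_breach_minute(cpu, period_minutes=5, threshold=70)
```

In this example the one-minute metric exposes the spike at minute 4, while the five-minute metric first averages the spike away (60% for the first period) and only exposes it at minute 10.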
Make sure that the health check feature is configured correctly to detect that EC2 instances registered with an auto scaling group are functioning normally. Otherwise an auto scaling group cannot perform basic functions like removing and replacing failed instances.
If you are using Elastic Load Balancing (ELB) to distribute traffic between instances in an auto scaling group, make sure that ELB health checks are enabled for the group. The group then takes application-level load balancer checks into account, in addition to the EC2 status checks performed at the hypervisor level.
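The difference between the two health check types can be sketched as a filter over instance statuses. The `InstanceStatus` type and `unhealthy_instances` function are hypothetical. The sketch assumes that with the `ELB` type an instance must pass both the EC2 status checks and the load balancer check, while with the `EC2` type only the status checks count.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InstanceStatus:
    """Health signals for one instance (hypothetical helper type)."""
    instance_id: str
    ec2_status_ok: bool  # result of the EC2 status checks
    elb_check_ok: bool   # result of the load balancer health check


def unhealthy_instances(instances: List[InstanceStatus],
                        health_check_type: str) -> List[str]:
    """Return IDs of instances the group would mark for replacement."""
    if health_check_type not in ("EC2", "ELB"):
        raise ValueError("health_check_type must be 'EC2' or 'ELB'")
    failed = []
    for inst in instances:
        healthy = inst.ec2_status_ok
        if health_check_type == "ELB":
            # The application-level check is considered in addition.
            healthy = healthy and inst.elb_check_ok
        if not healthy:
            failed.append(inst.instance_id)
    return failed


fleet = [
    InstanceStatus("i-a", ec2_status_ok=True, elb_check_ok=True),
    InstanceStatus("i-b", ec2_status_ok=False, elb_check_ok=False),
    InstanceStatus("i-c", ec2_status_ok=True, elb_check_ok=False),  # app is down
]
```

Here `i-c` runs fine as a virtual machine but fails at the application level, so it is only replaced when the `ELB` type is used.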
Predictive scaling uses workload forecasting to plan future capacity. Predictions will be of higher quality if workloads have a cyclical performance pattern. Try running predictive scaling in “forecast only” mode, to evaluate the quality of the predictions and scaling actions the policy generates. If you are satisfied with the predictions, set the policy to “forecast and scale”.
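The two modes can be expressed as one policy definition with a switch. The sketch below only builds a dictionary. Its keys are intended to follow the shape of boto3's `put_scaling_policy` request for predictive scaling and should be checked against the current API reference before use. The function name, policy name pattern, and CPU-based metric choice are assumptions for the example.

```python
from typing import Any, Dict

VALID_MODES = ("ForecastOnly", "ForecastAndScale")


def predictive_policy(group_name: str, target_cpu_percent: float,
                      mode: str = "ForecastOnly") -> Dict[str, Any]:
    """Build a predictive scaling policy request; defaults to forecast only."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-predictive",  # hypothetical naming scheme
        "PolicyType": "PredictiveScaling",
        "PredictiveScalingConfiguration": {
            "MetricSpecifications": [{
                "TargetValue": float(target_cpu_percent),
                "PredefinedMetricPairSpecification": {
                    "PredefinedMetricType": "ASGCPUUtilization",
                },
            }],
            "Mode": mode,
        },
    }


# Start by evaluating forecasts only, then promote the same policy.
trial = predictive_policy("web-asg", 50)
live = predictive_policy("web-asg", 50, mode="ForecastAndScale")
```

Keeping the mode as the only difference between the two definitions makes the promotion step a one-field change once the forecasts look trustworthy.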
If you don’t have any other monitoring mechanism for auto scaling, make sure your auto scaling group is configured to send email notifications upon scale-out or scale-in events. When notifications are enabled, an Amazon SNS topic associated with the auto scaling group receives scaling events and forwards them to the email address you specified during the setup process.
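A notification setup boils down to a group name, a topic, and a list of event types. The sketch below builds a dictionary shaped like boto3's `put_notification_configuration` request without calling AWS. The function name and the topic ARN are placeholders, and the email subscription on the SNS topic is assumed to exist already.

```python
from typing import Any, Dict

# Event types for instances entering and leaving the group.
LAUNCH_AND_TERMINATE = [
    "autoscaling:EC2_INSTANCE_LAUNCH",
    "autoscaling:EC2_INSTANCE_TERMINATE",
]
ERRORS = [
    "autoscaling:EC2_INSTANCE_LAUNCH_ERROR",
    "autoscaling:EC2_INSTANCE_TERMINATE_ERROR",
]


def notification_request(group_name: str, topic_arn: str,
                         include_errors: bool = True) -> Dict[str, Any]:
    """Build a notification configuration for scale-out and scale-in events."""
    if not topic_arn.startswith("arn:") or ":sns:" not in topic_arn:
        raise ValueError("topic_arn does not look like an SNS topic ARN")
    types = list(LAUNCH_AND_TERMINATE)
    if include_errors:
        # Failed launches and terminations are often the events worth an email.
        types.extend(ERRORS)
    return {
        "AutoScalingGroupName": group_name,
        "TopicARN": topic_arn,
        "NotificationTypes": types,
    }


request = notification_request(
    "web-asg",
    "arn:aws:sns:us-east-1:123456789012:asg-events",  # placeholder ARN
)
```

Including the error types is a design choice in this sketch: a group that silently fails to launch instances is usually a bigger problem than one that scales as expected.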
Elastigroup provides AI-driven prediction of spot instance interruptions, and automated workload rebalancing with an optimal blend of spot, reserved and on-demand instances. It lets you leverage spot instances to reduce costs in AWS, even for production and mission-critical workloads, with low management overhead.
Key features of Elastigroup include: