In this post, we will guide you through using Amazon EC2 Spot instances within Auto Scaling groups using the “Spotinst for AutoScale” feature.
Auto Scaling Groups let you scale your Amazon EC2 capacity up or down automatically according to conditions you define on your CloudWatch metrics (CPU based, Request Count based, etc.).
A change in a CloudWatch metric triggers the CloudWatch alarm, which performs an action: a message sent to either the scale-in policy or the scale-out policy.
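For readers wiring this up by hand, here is a minimal sketch of that policy/alarm pairing using boto3. The group name, policy name, thresholds, and periods are illustrative assumptions, not values from Spotinst:

```python
import boto3

autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")

# Scale-out policy: add one instance to the group when triggered.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",          # assumed group name
    PolicyName="scale-out-on-cpu",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=1,
    Cooldown=300,
)

# CloudWatch alarm: fire the policy when average CPU exceeds 70% for 10 minutes.
cloudwatch.put_metric_alarm(
    AlarmName="web-asg-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "web-asg"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=70.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[policy["PolicyARN"]],
)
```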
When we came across this elastic scaling feature, we wanted to help our customers maintain a baseline group of On-Demand instances, and to architect it so that every scaling activity launches Spot instances first (instead of On-Demand), while also providing a layer that ensures protection when a Spot resource is not available (due to being outbid or instance termination).
Use Spot Instances with Auto Scaling groups
In order to use Spot Instances with Auto Scaling groups, you must consider the following:
- Choose the right Spot instance types and Availability Zones, based on prices (see the price-lookup sketch below this list).
- Place the right bid – how to pay the minimum while gaining maximum availability.
- How to combine both Spot instances and On-Demand instances within Auto Scaling groups.
- How to combine multiple instance types within an Auto Scaling group.
- How to correctly set up Auto Scaling policies and CloudWatch alarms so that scaling activities always (or first) come from Spot instances.
- How to guarantee that whenever Spot instances are not available, On-Demand instances are used instead.
The goal is to provide the maximum availability using a multi-availability zone deployment, and diverse distribution of instance types for Spot market redundancy.
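As a rough illustration of the “based on prices” step above, you can pull recent Spot price history per instance type and Availability Zone with boto3. This is a sketch only; the instance types, region, and time window are assumptions:

```python
import boto3
from datetime import datetime, timedelta

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

# Look at the last 24 hours of Linux Spot prices for a few candidate types.
history = ec2.describe_spot_price_history(
    InstanceTypes=["c3.large", "c4.large", "r3.large"],
    ProductDescriptions=["Linux/UNIX"],
    StartTime=datetime.utcnow() - timedelta(days=1),
)

# Keep the cheapest observed price per (instance type, Availability Zone).
cheapest = {}
for record in history["SpotPriceHistory"]:
    key = (record["InstanceType"], record["AvailabilityZone"])
    price = float(record["SpotPrice"])
    cheapest[key] = min(price, cheapest.get(key, price))

for (itype, az), price in sorted(cheapest.items(), key=lambda kv: kv[1]):
    print(f"{itype} in {az}: ${price:.4f}/hour")
```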
When creating a Spotinst group you must specify:
- Base instance type: single type
- Scaling instance types: an array of types
- Incoming traffic method: Elastic Load Balancer or Spotinst Proxy
- Balance load: Round Robin or Utilization
NOTE: Spotinst Proxy enables you to choose the balancing mode: balance load based on percentage of CPU usage (utilization) or Round Robin.
For example:
- Base instance type: c3.large
- Scaling instance types: c3.large, c4.large, r3.large, m3.large, c4.xlarge
- Incoming traffic method: Elastic Load Balancer
- Balance load: Round Robin (automatically)
A base multi-Availability Zone On-Demand Auto Scaling group will be created. This Auto Scaling group should span all available Availability Zones in the region.
NOTE: You can select multiple instance types with different numbers of cores.
After creating a base Auto Scaling group with the desired instance type and launch configuration, we will configure the Spot instance scaling activities.
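A minimal sketch of that base setup with boto3 is shown below; the names, AMI ID, security group, sizes, and Availability Zone list are placeholder assumptions:

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# Launch configuration for the On-Demand baseline.
autoscaling.create_launch_configuration(
    LaunchConfigurationName="base-ondemand-lc",
    ImageId="ami-xxxxxxxx",          # placeholder AMI
    InstanceType="c3.large",
    SecurityGroups=["sg-xxxxxxxx"],  # placeholder security group
)

# Base Auto Scaling group spread across the region's Availability Zones.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="base-ondemand-asg",
    LaunchConfigurationName="base-ondemand-lc",
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=2,
    AvailabilityZones=["us-east-1a", "us-east-1b", "us-east-1d", "us-east-1e"],
    LoadBalancerNames=["my-elb"],    # placeholder ELB name
)
```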
From your Spotinst Dashboard, navigate to “Amazon Web Services” -> “Auto Scaling” -> “Auto Scaling Group”:
Select your relevant Auto Scaling group from the list:
At this step, Spotinst calculates which additional groups are needed and how to diversify across the Cartesian product of Availability Zones and instance types, and offers a complete stack of Spot scaling activity.
The complete stack includes:
- Base On-Demand Auto Scaling Group with Scale-up and Scale-down policies and CloudWatch alarms.
- A set of Spot Auto Scaling Groups according to your chosen instance types and availability zones.
- Spotinst will determine the best bid automatically for each instance type in every availability zone.
- Spotinst will choose only the worthwhile instance types – those with the lowest prices over the past day, week, month, and 3 months.
- To let your Auto Scaling groups run for long periods of time, Spotinst’s Optimizer monitors prices in real time and determines whether the group is still profitable.
- Spotinst takes Spot market anomalies into consideration. For example, if a specific instance type in a given Availability Zone runs for too many hours at a price higher than its On-Demand price, Spotinst will eliminate that instance type (refer to the yellow line in the graph).
NOTE: If an instance type in a given Availability Zone shows abnormal price spikes, Spotinst will ignore it as well (refer to the blue line in the graph).
NOTE: Whenever an instance type in a given Availability Zone becomes unprofitable (as shown by the red line in the graph), Spotinst recognizes it and gradually terminates all instances from that specific market while spinning up Spot instances from other markets.
Spotinst calculates the relevant price and availability (a rank which Spotinst grants for each Instance Type x Availability Zone x Product).
In the example shown, Spotinst has chosen 3 Availability Zones out of 4 for c3.large in the us-east-1 region.
The most challenging part of setting up the stack manually is correctly setting and tracking the scaling activities and the CloudWatch alarm thresholds.
It gets even more complicated when your base On-Demand group runs with 2 cores per instance while another Spot Auto Scaling Group runs with 4 cores per instance.
For example, a base On-Demand group of c4.large, and 3 Spot Auto Scaling Groups of c3.large, r3.large, and c4.xlarge. Wrong values in the scaling policies and incorrect CloudWatch alarm correlation will cause complete disorder in the whole scaling activity.
Spotinst brings order by adjusting and correlating the CloudWatch alarms according to the base Auto Scaling Group, the instance types, and your running workload.
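As a toy illustration of why this correlation matters (not Spotinst’s actual logic; the vCPU counts, base threshold, and metric choice are assumptions), a per-instance alarm threshold has to be scaled by relative instance capacity so that groups of different sizes still react to the same aggregate load:

```python
# Toy illustration only: normalize a per-instance request-count alarm threshold
# by vCPU count so that groups with different instance sizes scale out at the
# same aggregate load. vCPU counts and the base threshold are assumptions.
VCPUS = {"c4.large": 2, "c3.large": 2, "r3.large": 2, "c4.xlarge": 4}

BASE_TYPE = "c4.large"
BASE_THRESHOLD = 1000  # requests/instance that triggers scale-out on the base group

def adjusted_threshold(instance_type: str) -> float:
    """Scale the per-instance threshold by relative vCPU capacity."""
    return BASE_THRESHOLD * VCPUS[instance_type] / VCPUS[BASE_TYPE]

for itype in ("c3.large", "r3.large", "c4.xlarge"):
    print(f"{itype}: scale out above {adjusted_threshold(itype):.0f} requests/instance")
```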
After pressing “Add”, Spotinst automatically provisions the most profitable instance types in the relevant Availability Zones and starts monitoring and optimizing them while they are running.
Eventually, Spotinst reflects this in the console, tracking the number of hours run and the money spent, and of course the savings percentage relative to On-Demand usage.
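For reference, that savings figure is simply the relative discount of the Spot spend against what the same hours would have cost On-Demand. A trivial sketch with made-up prices:

```python
# Hypothetical numbers: 100 instance-hours at a $0.105/hr On-Demand rate,
# actually paid at an average Spot price of $0.021/hr.
hours = 100
on_demand_price = 0.105
avg_spot_price = 0.021

on_demand_cost = hours * on_demand_price   # $10.50
spot_cost = hours * avg_spot_price         # $2.10
savings_pct = (on_demand_cost - spot_cost) / on_demand_cost * 100
print(f"Saved {savings_pct:.0f}% versus On-Demand")   # ~80%
```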
As stated, the Spotinst Optimizer keeps running and monitors the group’s status every few seconds. In case of a Spot market failure or abnormal Spot market changes that cause part of the group to become unprofitable, Spotinst automatically fixes this by gracefully replacing the instances.