AWS Spot Fleet: The First Step to Saving with Spot Instances

What is AWS Spot Fleet?

Amazon Web Services (AWS) Spot Fleets are collections of AWS spot instances, which are virtual servers drawn from Amazon’s pool of spare capacity and offered at discounts of up to 90%. Spot instances need to be carefully managed, because Amazon can terminate them at short notice when it needs the capacity back or when the market price goes above your maximum price.

Applications can make requests for Spot Fleets via the Spot Fleet application programming interface (API) or the command line interface (CLI). Because spot instance pricing and availability change often, EC2 continuously attempts to maintain the requested capacity according to predefined values.

To request a Spot Fleet, you need to define a maximum price per instance hour and the desired target capacity. You are also asked to input launch specifications, defining instance types, number of required instances, and choosing availability zones.

When AWS needs to draw from spare EC2 capacity to serve customers who purchased reserved instances, savings plans, or on-demand instances, it terminates spot instances. The Spot Fleet then attempts to launch replacement instances to maintain its target capacity.

When fulfilling a Spot Fleet request, Spot Fleet automatically chooses the lowest-priced instances that match the request. Once a Spot Fleet is active, the service keeps managing its collection of provisioned instances automatically.
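The lowest-price selection can be pictured with a small sketch. The pool names and prices below are invented for illustration, and the function is not part of any AWS API. It only mimics the idea of filtering capacity pools by a maximum price and picking the cheapest one.

```python
from typing import Optional

# Hypothetical spot prices (USD per instance hour) per capacity pool.
# A pool here is one instance type in one availability zone; values are made up.
SPOT_PRICES = {
    ("m5.large", "us-east-1a"): 0.035,
    ("m5.large", "us-east-1b"): 0.031,
    ("c5.large", "us-east-1a"): 0.038,
}


def cheapest_pool(prices: dict, max_price: float) -> Optional[tuple]:
    """Return the lowest-priced pool at or below max_price, or None."""
    eligible = {pool: price for pool, price in prices.items() if price <= max_price}
    if not eligible:
        return None
    return min(eligible, key=eligible.get)


print(cheapest_pool(SPOT_PRICES, max_price=0.04))
```

If no pool is priced at or below the maximum, the function returns None, which loosely corresponds to a request that cannot be fulfilled at the current price.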

This is part of our series of articles about AWS Auto Scaling.

In this article, you will learn:

What Problems Does Spot Fleet Solve?
How Spot Fleet Requests Work
Spot Fleet Auto Scaling
Spot Fleet Limitations
AWS Spot Instance Management with Spot

What Problems Does Spot Fleet Solve?

The basic problem with spot instances is that they can be terminated at short notice. This can be problematic for many types of applications, in particular mission critical applications. The application must be able to withstand the failure of any component running on a spot instance, smoothly moving that component to another spot instance or an on-demand instance.

Another challenge is that spot instance availability is very volatile. If the organization specifies one instance type and availability zone for its spot instances, there is a high risk that this specific type of spot instance will not remain available for as long as needed. However, by mixing instance types and offering several possible availability zones, there is a bigger chance of maintaining spot instance availability.

The idea behind Spot Fleet is to allow Amazon users to:

Use a mix of spot instances, with different instance types and drawing from different availability zones
Use spot instances, but also provide on-demand instances as a backup, in case spot instances are not available
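As a rough sketch of the first point, the snippet below builds one launch specification per combination of instance type and availability zone. The field names follow the LaunchSpecifications structure of a Spot Fleet request. The instance types, zones, and AMI ID are placeholders chosen for illustration, and the helper function is hypothetical.

```python
from itertools import product

INSTANCE_TYPES = ["m5.large", "c5.large"]           # example instance types
AVAILABILITY_ZONES = ["us-east-1a", "us-east-1b"]   # example zones
AMI_ID = "ami-0123456789abcdef0"                    # placeholder AMI ID


def build_launch_specifications(instance_types, zones, image_id):
    """Build one launch specification per (instance type, zone) pair."""
    return [
        {
            "ImageId": image_id,
            "InstanceType": instance_type,
            "Placement": {"AvailabilityZone": zone},
        }
        for instance_type, zone in product(instance_types, zones)
    ]


specs = build_launch_specifications(INSTANCE_TYPES, AVAILABILITY_ZONES, AMI_ID)
print(len(specs))  # 4
```

Two instance types across two zones give the fleet four pools to draw from instead of one.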

Spot Fleet also allows developers to specify scripts that should run in the event of instance termination. Previously, this would be done by a controller application, which represents a single point of failure. Now, developers can hand over management of spot instance termination to an Amazon service.
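One way a script running on the instance can learn that termination is imminent is the spot interruption notice exposed by the EC2 instance metadata service. The sketch below only parses such a notice and computes the time remaining. The sample body and timestamp are invented, and fetching the document (including any metadata service token handling) is left out, so treat this as an illustration rather than a complete handler.

```python
import json
from datetime import datetime, timezone

# Path of the interruption notice in the EC2 instance metadata service.
# It returns 404 until the instance has been marked for interruption.
INSTANCE_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"


def seconds_until_action(notice_body: str, now: datetime) -> float:
    """Parse an interruption notice and return the seconds left before it takes effect."""
    notice = json.loads(notice_body)
    action_time = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    action_time = action_time.replace(tzinfo=timezone.utc)
    return (action_time - now).total_seconds()


# Example notice body; the timestamp is invented for illustration.
sample = '{"action": "terminate", "time": "2024-01-01T12:02:00Z"}'
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(seconds_until_action(sample, now))  # 120.0
```

A cleanup script could use the remaining seconds to decide whether there is time to drain connections or checkpoint state.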

How Spot Fleet Requests Work

To create a Spot Fleet, you need to issue a Spot Fleet request, specifying:

How many instances you need (target capacity)
How many on-demand instances (optional)
Launch specifications, including instance type, AMI to use, availability zone, and security groups

There are two types of Spot Fleet requests:

Request: creates a Spot Fleet on a one-time basis, and does not replace instances that are later terminated
Maintain: creates a Spot Fleet and maintains a desired capacity on an ongoing basis, identifying and replacing failed instances
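The request parameters above can be sketched as a Python dictionary in the shape of a Spot Fleet request configuration. The builder function is hypothetical, and the role ARN and launch template ID are placeholders. This sketch uses launch template configurations rather than launch specifications because, per the EC2 API reference, requests that include on-demand capacity must use launch templates.

```python
import json

VALID_TYPES = {"request", "maintain"}


def build_spot_fleet_config(target_capacity, on_demand_capacity, fleet_type,
                            launch_template_configs, iam_fleet_role):
    """Assemble a dict in the shape of a Spot Fleet request configuration."""
    if fleet_type not in VALID_TYPES:
        raise ValueError(f"fleet_type must be one of {sorted(VALID_TYPES)}")
    if not 0 <= on_demand_capacity <= target_capacity:
        raise ValueError("on-demand capacity must be between 0 and target capacity")
    return {
        "IamFleetRole": iam_fleet_role,
        "Type": fleet_type,
        "TargetCapacity": target_capacity,
        "OnDemandTargetCapacity": on_demand_capacity,
        "LaunchTemplateConfigs": launch_template_configs,
    }


config = build_spot_fleet_config(
    target_capacity=10,
    on_demand_capacity=2,          # portion of the fleet kept on-demand
    fleet_type="maintain",
    launch_template_configs=[{
        "LaunchTemplateSpecification": {
            "LaunchTemplateId": "lt-0123456789abcdef0",  # placeholder
            "Version": "1",
        },
        "Overrides": [
            {"InstanceType": "m5.large", "AvailabilityZone": "us-east-1a"},
            {"InstanceType": "c5.large", "AvailabilityZone": "us-east-1b"},
        ],
    }],
    iam_fleet_role="arn:aws:iam::123456789012:role/example-fleet-role",  # placeholder
)
print(json.dumps(config, indent=2))
```

With boto3, a dictionary like this would typically be passed as the SpotFleetRequestConfig argument of the EC2 client's request_spot_fleet call. The call is not made here, so the sketch runs without AWS credentials.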

Before making a Spot Fleet request, you should carefully consider which instance types will meet your application requirements, and what portion of the fleet should be on-demand instances, to ensure that in case spot instances are terminated, at least some of your instances will remain active.

Spot Fleet Auto Scaling

Auto scaling lets you automatically increase or decrease a Spot Fleet’s target capacity based on current demand. Spot Fleet can launch additional instances (scale out) or terminate instances (scale in) within a specified range, based on one of several scaling strategies.

Spot Fleet supports the following types of auto scaling:

Target tracking: increase or decrease capacity to keep a specific metric at or near a target value. For example, you can specify that average CPU utilization across the instances in the Spot Fleet should stay at around 80%.
Step scaling: increase or decrease the capacity of the fleet in a series of graduated adjustments, triggered by a CloudWatch alarm.
Scheduled scaling: increase or decrease capacity by a predetermined amount at a specific date and time.
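To make step scaling more concrete, here is a minimal simulation of how a set of step adjustments might be evaluated. The field names mirror the step adjustment structure used by step scaling policies, where bounds are offsets from the alarm threshold. The thresholds, step sizes, and both helper functions are assumptions for illustration, not the service's actual implementation.

```python
import math

# Step adjustments: bounds are offsets from the alarm threshold,
# and None means unbounded on that side. Values are illustrative.
STEP_ADJUSTMENTS = [
    {"MetricIntervalLowerBound": 0, "MetricIntervalUpperBound": 10, "ScalingAdjustment": 1},
    {"MetricIntervalLowerBound": 10, "MetricIntervalUpperBound": None, "ScalingAdjustment": 3},
]


def step_adjustment(metric_value, threshold, steps):
    """Return the capacity change for the step whose interval contains the breach size."""
    breach = metric_value - threshold
    for step in steps:
        lower = step["MetricIntervalLowerBound"]
        upper = step["MetricIntervalUpperBound"]
        lower = -math.inf if lower is None else lower
        upper = math.inf if upper is None else upper
        if lower <= breach < upper:
            return step["ScalingAdjustment"]
    return 0  # no step matches, so capacity is unchanged


def scale(current, adjustment, min_capacity, max_capacity):
    """Apply an adjustment and clamp the result to the allowed capacity range."""
    return max(min_capacity, min(max_capacity, current + adjustment))


# CPU at 85% against a 70% threshold breaches by 15, so the second step applies.
change = step_adjustment(85, 70, STEP_ADJUSTMENTS)
print(scale(current=8, adjustment=change, min_capacity=2, max_capacity=10))  # 10
```

The clamp in scale reflects the specified range mentioned earlier: the fleet asks for three more instances but stops at the configured maximum of ten.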

A few important points about how Spot Fleet auto scaling works:

Spot Fleet uses a cooldown period for scaling events. This means that after a scaling event, it waits for a configured interval so that system indicators can adjust, and does not start another scaling event in the same direction. For example, after scaling out, it waits for CPU utilization to adjust before determining if it needs to scale out further.
If a CloudWatch alarm calls for scaling out while a scale-in cooldown is in effect, the scale out is performed immediately, so that availability is not affected.
Amazon recommends using instance metrics with a 1-minute frequency for scaling events, to enable faster response to application demand. This requires enabling detailed monitoring in CloudWatch, which incurs an additional cost.
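The cooldown behavior for repeated scale-out events can be sketched with a small gate object. The class name, the 300-second value, and the explicit timestamps are all illustrative assumptions. The real service tracks cooldowns internally.

```python
class CooldownGate:
    """Blocks repeat scale-out activities until the cooldown has elapsed."""

    def __init__(self, cooldown_seconds: float):
        self.cooldown_seconds = cooldown_seconds
        self.last_scale_out = None  # timestamp of the last permitted scale out

    def allow_scale_out(self, now: float) -> bool:
        """Return True and start a new cooldown if scaling out is permitted at `now`."""
        in_cooldown = (
            self.last_scale_out is not None
            and now - self.last_scale_out < self.cooldown_seconds
        )
        if in_cooldown:
            return False
        self.last_scale_out = now
        return True


gate = CooldownGate(cooldown_seconds=300)  # 300 is an illustrative value
print(gate.allow_scale_out(now=0))     # True
print(gate.allow_scale_out(now=120))   # False
print(gate.allow_scale_out(now=300))   # True
```

Passing timestamps in explicitly, rather than reading the clock inside the class, keeps the sketch deterministic and easy to test.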

Spot Fleet Limitations

AWS Spot Fleet is a major improvement over previous methods of managing spot instances, but it is still lacking.

Here are key features related to spot instance management which Spot Fleet does not provide.

No guaranteed early prediction of instance termination: even with Spot Fleet, spot instances are terminated with two minutes of notice. While Amazon provides a “rebalance recommendation” signal, it does not guarantee that the signal will arrive more than two minutes before termination. This means you cannot rely on having enough time to migrate workloads to another instance.
No guaranteed storage persistence: if a spot instance is terminated, Spot Fleet does not guarantee that the new instance can attach to the old EBS volume. This depends on the availability of EBS volume capacity in the same instance pool, and requires that the spot request is defined as “persistent”.
Limited support for containers: container auto scaling requires advanced configuration, and container right-sizing is not supported.
No automatic fallback to on-demand: Spot Fleet does not let you automatically fail over to an on-demand instance when a spot instance fails.
No proactive instance auto-recovery: Spot Fleet only replaces an instance after the spot instance has been terminated, and only for “maintain” requests.
No guaranteed IP persistence: IP persistence is supported only if the instance is defined as “persistent” or the fleet is defined as “maintain”.

AWS Spot Instance Management with Spot

Spot allows you to reliably run even mission-critical and production workloads on EC2 spot instances. Its predictive rebalancing algorithms detect spot instance interruptions up to an hour in advance and proactively replace the at-risk instances. If no spot instances are available, workloads fall back to on-demand instances, thereby guaranteeing workload continuity. If there are available, unused Savings Plans or RIs, Spot will use those before spinning up new spot instances to ensure maximum cost-efficiency.

Learn more about using spot instances for both non-containerized workloads as well as for containerized and Kubernetes workloads.