Provision and manage Elastigroup with the new AWS Quickstart Guide

Karan Shetty

Product Architect

January 23, 2023

Elastigroup, Spot by NetApp’s first product and core technology, can now be easily provisioned and managed through an AWS Quickstart Guide created in collaboration with the AWS QuickStart team.

Elastigroup for AWS is an IaaS optimization platform that lets you provision, manage, and scale EC2 instances to support any elastic application or load-balanced workload while leveraging multiple instance purchasing options. It integrates with several AWS services and related tools, such as ALB/ELB, ASG, Beanstalk, Route 53, OpsWorks, CodeDeploy, Chef, and EMR.

To get started, sign up with us and connect the AWS account in which your AWS resources will be deployed using CloudFormation (CFN). You will be automatically enrolled in a Freemium plan, and NetApp will not charge for the workloads deployed by the guide.

The deployment guide provides a CFN template that deploys an Elastigroup into a new or existing VPC.

The Elastigroup manages highly available EC2 web servers that connect to an ALB and span three Availability Zones. With Elastigroup, your workloads can see up to a 90% EC2 discount by using spot instances, and Elastigroup ensures that any unused reservations or Savings Plans are fully utilized before it launches spot instances. Intelligent Traffic Flow (ITF) and predictive autoscaling are enabled by default. The template also enables scheduled actions, which take down all the deployed web servers on weekends and bring everything back up for your workdays.
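The weekend and workday schedule described above can be pictured with a small shell sketch. It is only an illustration: the function name scheduled_capacity, the weekday capacity of 3, and the assumption that the weekend means Saturday and Sunday are hypothetical and are not taken from the CFN template.

```shell
#!/bin/sh
# Hypothetical sketch of a weekend-off schedule; not the template's actual logic.
# Usage: scheduled_capacity DAY_OF_WEEK WEEKDAY_CAPACITY
# DAY_OF_WEEK follows the `date +%u` convention: 1 = Monday ... 7 = Sunday.
scheduled_capacity() (
    case $1 in
        6|7) echo 0 ;;                 # Saturday, Sunday: all web servers down
        1|2|3|4|5) echo "$2" ;;        # workdays: back to the normal capacity
        *) echo "invalid day: $1" >&2; exit 1 ;;
    esac
)

# Target capacity for today, assuming 3 web servers on workdays.
scheduled_capacity "$(date +%u)" 3
```

In the deployed group the schedule is part of the Elastigroup configuration, so nothing like this runs on the instances themselves.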

For detailed deployment steps, refer to the AWS Quickstart Guide.

To achieve the greatest savings, you need to deploy and manage spot instances. The best practice for highly available spot instances is to add more instance types and AZs. But adding instance types of different sizes adds complexity and puts strain on your workloads, and most of the time you end up with EC2 instances that are either underutilized because they are too large or overutilized because they are too small.

To avoid these scenarios, Elastigroup uses predictive autoscaling to eliminate over- and underutilized instances. This feature uses a machine learning algorithm to predict the pattern of your workload and keeps the instances at the Target CPU threshold. It is paired with Intelligent Traffic Flow, which makes sure traffic is distributed among the instances in proportion to their size.
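To make the idea of holding instances at a target CPU threshold concrete, here is a minimal sketch of the usual target-tracking arithmetic. It is not Elastigroup's algorithm, which is not described here; the function name desired_capacity and the sample numbers are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: estimate how many instances keep average CPU at a target.
# Usage: desired_capacity CURRENT_INSTANCES PREDICTED_CPU_PCT TARGET_CPU_PCT
desired_capacity() (
    current=$1; predicted=$2; target=$3
    # Integer ceiling of current * predicted / target.
    needed=$(( (current * predicted + target - 1) / target ))
    # Never go below one instance.
    [ "$needed" -lt 1 ] && needed=1
    echo "$needed"
)

# 4 instances predicted to run at 90% CPU, with a 60% target.
desired_capacity 4 90 60   # prints 6
```

A predictive scaler would feed a forecast CPU value into this kind of calculation, rather than the current measurement, so that the extra instances are already running when the load arrives.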

Features enabled in the deployed Elastigroup:

1. Elastigroup predictive autoscaling

Elastigroup predictive autoscaling uses a machine learning algorithm to predict the CPU utilization pattern of your workloads and increases the number of instances based on the projected CPU utilization. Predictive autoscaling also helps when your instances or applications take a long time to boot. It can scale the instances ahead of the actual traffic, saving up to 30 minutes of startup time.

2. Elastigroup ITF

Intelligent Traffic Flow (ITF) is a software layer connected to Elastigroup that manages and controls the distribution of incoming traffic for optimal instance utilization and high performance. Users can select as many instance types and sizes as they want, while ITF ensures that traffic is spread across instances in proportion to their size.

Before the development of ITF, users of autoscaling workloads were often confined to a single instance size if they scaled their workloads based on metrics. For example, if you scale an ASG on CPU utilization and have a t2.nano and a c5.metal server within your group, the scaling decisions are not proportional to the load these very different instances can handle. Not only will Elastigroup launch the best instance type, size, and family at that given moment based on real-time spot market conditions, but it will also dynamically manage the target groups on your ALB to adjust to the changing load.
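The t2.nano and c5.metal example can be worked through with a short sketch. It assumes that traffic is weighted by vCPU count (1 vCPU for t2.nano and 96 for c5.metal, per the published EC2 instance specifications); how ITF actually computes its weights is not described here, and the function name traffic_shares is hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: split traffic in proportion to vCPU count.
# Usage: traffic_shares TYPE:VCPUS [TYPE:VCPUS ...]
traffic_shares() (
    total=0
    for entry in "$@"; do
        total=$(( total + ${entry##*:} ))
    done
    for entry in "$@"; do
        name=${entry%%:*}
        vcpus=${entry##*:}
        # Work in tenths of a percent (truncated) so that small
        # instances do not collapse to zero.
        tenths=$(( vcpus * 1000 / total ))
        echo "$name $(( tenths / 10 )).$(( tenths % 10 ))%"
    done
)

traffic_shares t2.nano:1 c5.metal:96
```

Under this assumption the t2.nano receives about 1% of requests and the c5.metal about 99%, whereas an even split would send half the traffic to the smallest instance.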

Test case scenario

To test real-time workloads, you can use the user data below. It mimics a spiky workload by installing the stress utility and starting two CPU-bound workers every 30 minutes, with each run timing out after 10 minutes:

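#!/bin/bash
# Update packages and install the Apache web server.
yum update -y
yum install -y httpd.x86_64
# The stress utility comes from the EPEL repository.
amazon-linux-extras install epel -y
yum install stress -y
# Start Apache and publish a simple landing page.
service httpd start
echo "Welcome to Elastigroup tutorial." > /var/www/html/index.html
# Write a script that runs two CPU-bound workers for 600 seconds.
# Single quotes keep "!" literal even if this is pasted into an interactive shell.
echo '#!/bin/bash' > /home/ec2-user/stress.sh
echo "sudo stress --cpu 2 --timeout 600s" >> /home/ec2-user/stress.sh
chmod +x /home/ec2-user/stress.sh
# Schedule the script at minute 0 and minute 30 of every hour.
echo "0,30 * * * * /home/ec2-user/stress.sh" > /home/ec2-user/stress-test
crontab /home/ec2-user/stress-test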

You can insert the above user data by editing the already deployed CFN template and setting the ShouldRoll parameter to True. You can also do so from the UI: click Edit Elastigroup configuration, insert the user data, and then save. By clicking Deploy under Actions, you can roll the existing EC2 instances over to the updated user data.

After a few hours of running EC2 instances, the machine learning algorithm will predict the pattern of the spikes in your workload and make sure EC2 instances are available before the actual traffic comes in, thereby reducing the stress on existing instances.
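The spike pattern that the algorithm has to learn follows directly from the cron schedule and the 600-second timeout in the user data: load starts at minute 0 and minute 30 and lasts 10 minutes each time. The sketch below encodes that pattern; the function name stress_active is hypothetical and is not part of the test setup.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the stress test is expected to be running
# during the given minute of the hour (0-59).
stress_active() (
    # Drop one leading zero so values such as "08" from `date +%M`
    # are not parsed as octal.
    minute=${1#0}
    minute=${minute:-0}
    # Runs start at minute 0 and 30 and last 10 minutes.
    [ $(( minute % 30 )) -lt 10 ]
)

for m in 0 9 10 29 30 39 40; do
    if stress_active "$m"; then
        echo "minute $m: busy"
    else
        echo "minute $m: idle"
    fi
done
```

The load is therefore present for 10 of every 30 minutes, which gives predictive autoscaling a regular pattern to scale ahead of.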

By deploying the AWS Quickstart Guide and running the test scenario above, you can observe in real time how Elastigroup scales and manages EC2 workloads of different sizes with minimal user effort. It also keeps savings and availability high for the managed EC2 instances by using spot instances and any available reserved instances.

Get started with the new Elastigroup Quickstart Guide.