Demandbase reduces AWS EMR costs by 70% using EC2 spot instances
Running Big Data workloads is common in the Account-based Marketing industry and requires substantial computing resources. Demandbase uses hundreds of resource-intensive instances to process hundreds of terabytes of data. As Demandbase became more successful, their user base grew substantially and their infrastructure had to scale accordingly, and costs became a primary concern. Demandbase first tried to address the rising costs with managed machine learning services, but experienced significant downtime and lacked the ability to customize and tune the environment.
Solutions like AWS EMR make it easy to get started with a managed cluster running Big Data frameworks such as Apache Spark and Hadoop. Demandbase eventually switched to AWS EMR running on spot instances, which gave them full access to the Spark cluster and the ability to customize it as needed. Although spot instances brought immediate cost savings, the downside is that the cloud provider can terminate them with little notice. That behavior can jeopardize a cluster’s reliability and data consistency.
Demandbase went looking for a solution that could provide DevOps automation for managing their large EMR infrastructure on spot instances. Their main concern was stability: the spot instance market changes often, so they lost nodes on a regular basis, and they had no good way to understand the market, handle the bidding process, or replace interrupted instances. Demandbase eventually found Elastigroup by Spot and was intrigued to see how it could help them automate their infrastructure and reduce costs.
With a few clicks in the Spot web console, Demandbase imported their existing AWS EMR deployments into Elastigroup. With the import complete, they could take advantage of Elastigroup’s EMR integration to automate their infrastructure and reduce costs.
Demandbase found that Elastigroup’s EMR integration made efficient and reliable use of their spot instances. Its built-in auto-scaling scales task nodes up or down by learning each workload’s requirements, duration, and patterns, and which compute resources suit it best. Elastigroup looks for the best combination of availability and price and chooses the instances that are most cost-effective and most likely to remain available for the job.
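The trade-off between price and availability described above can be illustrated with a small Python sketch. This is not Elastigroup's actual algorithm, which is not described here. The `SpotOffer` type, the `risk_penalty` weight, the `max_risk` cutoff, and all prices and risk figures are hypothetical and made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpotOffer:
    instance_type: str
    hourly_price: float       # USD per hour (illustrative, not real market data)
    vcpus: int
    interruption_risk: float  # assumed estimate of interruption probability, 0.0 to 1.0


def effective_cost(offer: SpotOffer, risk_penalty: float = 2.0) -> float:
    """Price per vCPU-hour, inflated in proportion to the estimated risk."""
    per_vcpu = offer.hourly_price / offer.vcpus
    return per_vcpu * (1.0 + risk_penalty * offer.interruption_risk)


def rank_offers(offers, max_risk=0.2):
    """Drop offers that are too risky, then sort the rest cheapest first."""
    eligible = [o for o in offers if o.interruption_risk <= max_risk]
    return sorted(eligible, key=effective_cost)


# Hypothetical candidates for a pool of task nodes.
offers = [
    SpotOffer("m5.2xlarge", 0.12, 8, 0.05),
    SpotOffer("r5.2xlarge", 0.10, 8, 0.30),   # cheapest per hour, but too risky
    SpotOffer("c5.4xlarge", 0.20, 16, 0.10),
]
ranked = rank_offers(offers)
print([o.instance_type for o in ranked])
```

In this sketch the offer with the lowest hourly price is excluded because its assumed interruption risk exceeds the cutoff, which is the kind of trade-off a purely price-driven selection would miss.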
Besides taking a hands-off approach to scaling their EMR infrastructure, Demandbase was able to further automate their administrative tasks because of how Elastigroup handles failures in the spot instance market. Elastigroup uses Spot’s prediction algorithm for the spot instance market to ensure that capacity does not drop below the user-defined capacity settings.
Elastigroup analyzes the spot market using machine learning and can predict interruptions up to 15 minutes ahead of time. When spot instances are at risk of termination, Elastigroup begins replacing them gracefully. This gave Demandbase the DevOps automation they needed to use spot instances effectively on AWS EMR. They no longer have to figure out the spot market themselves: bidding for the best price, determining the best availability, and manually replacing interrupted instances.
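The idea of replacing at-risk instances ahead of time while holding capacity at a user-defined target can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Elastigroup's implementation: the `Node` type, the `at_risk` flag (standing in for the output of an interruption predictor), and `plan_replacements` are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: str
    at_risk: bool = False  # hypothetical flag set by an interruption predictor


def plan_replacements(nodes, target_capacity):
    """Return (to_launch, to_drain) so healthy capacity reaches the target.

    Replacements are counted against healthy nodes only, so they can be
    launched before the at-risk nodes are actually terminated.
    """
    healthy = [n for n in nodes if not n.at_risk]
    to_drain = [n.node_id for n in nodes if n.at_risk]
    to_launch = max(0, target_capacity - len(healthy))
    return to_launch, to_drain


# Hypothetical task nodes, two of which are predicted to be interrupted.
nodes = [
    Node("task-1"),
    Node("task-2", at_risk=True),
    Node("task-3"),
    Node("task-4", at_risk=True),
]
to_launch, to_drain = plan_replacements(nodes, target_capacity=4)
print(to_launch, to_drain)
```

Launching the replacements first and draining the at-risk nodes afterwards is what keeps the cluster from dipping below its target while the swap takes place.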
After adopting Elastigroup, Demandbase reduced their cloud computing bill by over 70%. They also gained a more stable environment, automated scaling of task nodes and infrastructure, the ability to customize and tune the EMR environment as needed, and automated spot instance management that targets the best combination of price and availability.