Ticketmaster Accelerates AWS Adoption While Significantly Reducing Costs With Spot

Ticketmaster is a leading ticket sales and distribution company located in California that sells millions of tickets for a wide variety of events all around the world. In 2017, they publicly announced their migration to Amazon Web Services so that they could quickly scale their Kubernetes infrastructure to accommodate platform usage that balloons and recedes as big-name shows come and go. After moving to AWS, their developers were able to write software faster and respond to customers’ demands more quickly. As an organization moves to a new platform, however, scaling and management become important issues.

Scaling to handle the number of requests that Ticketmaster’s applications receive means that costs can rise quickly. Kubernetes environments can, naturally, help to reduce these costs, but an enterprise of Ticketmaster’s scale must always keep a keen eye on optimization wherever possible.

An organization can only scale to the breaking point of the underlying infrastructure, and Ticketmaster was looking for a way to simplify management because it was taking too much time and focus away from other areas that developers could be working on. Ticketmaster needed a solution that could automate infrastructure and pod scaling, integrate with their existing CI/CD workflow, manage SLA, and reduce their operating costs.

Working with spot instances & Elastigroup by Spot

After Ticketmaster noticed their cloud computing costs rising, they started to look for ways to reduce spending and came across spot instances. When researching EC2 Spot, they realized that they could significantly reduce costs, but there was no easy way to manage spot instances properly because they can be terminated with little notice. Shortly after their research, Ticketmaster discovered Elastigroup by Spot and realized that it could automate infrastructure and pod scaling, integrate with their existing CI/CD workflow, manage SLA, and reduce their operating costs.

Elastigroup is a service, similar to Auto Scaling Groups, that allows you to deploy and scale EC2 instances reliably and efficiently, maintaining SLA for production and mission-critical applications while saving up to 80% of the compute costs. Elastigroup predicts the behavior, capacity trends, pricing, and interruption rates of EC2 Spot, and automatically shifts workloads between spot instances and on-demand or reserved instances, in either direction.

It was crucial for Ticketmaster that Elastigroup could integrate with their existing CI/CD pipeline and deployment tools such as Terraform and GitLab CI. It was also extremely important that Elastigroup would integrate seamlessly with Kubernetes and drive better infrastructure decisions for their complex container deployments. For example, Ticketmaster had different container and pod sizes, and there was a need to allocate instance types dynamically according to the containers’ needs. In addition, persistent storage was an important requirement and needed to be considered before instances were launched.

Ticketmaster began evaluating Elastigroup and in less than a year, Ticketmaster was using it in production and was able to reduce operating costs on AWS by over 60%. Elastigroup is being used to manage their Kubernetes infrastructure and handle instance and Pod scaling more efficiently with its Pod-Driven Autoscaling – which re-schedules Pods to optimize clusters for performance and costs by scaling up instances based on container metrics and downsizing underutilized nodes. Ticketmaster no longer needed to worry about scaling Pods or infrastructure. Besides reducing costs, they were also able to allocate costs better across different Kubernetes tenants using Elastigroup’s cost show-back.

Ticketmaster uses Terraform to deploy code in their development, test, and production environments. With Elastigroup’s support for Terraform via the Spot provider, Ticketmaster did not have to make any changes to their workflow and was able to deploy their apps using the same processes they had before. They found Elastigroup very easy to use, and it made their migration from on-prem to AWS easier. With Ticketmaster now using Elastigroup, they have the added benefit of running their workloads with no downtime and a managed SLA.

Automation Means That Managing Containers No Longer Takes Up the Ticketmaster Team’s Time

Whenever there’s a risk of a spot instance interruption, Elastigroup redistributes workloads up to 15 minutes ahead of time, ensuring maximum availability at the best possible price. Elastigroup makes sure applications always run on the most cost-efficient mix of instances and will fall back to on-demand when spot is not available, in addition to prioritizing any reserved instances you may already own.
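The priority order described above (reserved capacity first, then spot, then on-demand as a fallback) can be sketched in a few lines of Python. This is only an illustration of the idea, not Elastigroup's actual logic: the `CapacityPool` type, the `choose_pool` function, the prices, and the `max_spot_risk` threshold are all hypothetical names and values introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CapacityPool:
    """Hypothetical view of one purchasing option for an instance type."""
    lifecycle: str                   # "reserved", "spot", or "on-demand"
    hourly_price: float              # illustrative price, not real AWS pricing
    available: bool                  # can capacity be launched right now?
    interruption_risk: float = 0.0   # predicted risk in [0, 1]; only used for spot


def choose_pool(pools: List[CapacityPool],
                max_spot_risk: float = 0.2) -> Optional[CapacityPool]:
    """Pick reserved capacity first, then the cheapest low-risk spot pool,
    and fall back to on-demand only when neither is usable."""
    usable = [p for p in pools if p.available]

    reserved = [p for p in usable if p.lifecycle == "reserved"]
    if reserved:
        return min(reserved, key=lambda p: p.hourly_price)

    spot = [p for p in usable
            if p.lifecycle == "spot" and p.interruption_risk <= max_spot_risk]
    if spot:
        return min(spot, key=lambda p: p.hourly_price)

    on_demand = [p for p in usable if p.lifecycle == "on-demand"]
    if on_demand:
        return min(on_demand, key=lambda p: p.hourly_price)

    return None  # nothing can be launched at the moment
```

In this sketch, a spot pool whose predicted interruption risk exceeds the threshold is treated the same as an unavailable one, so the workload lands on on-demand capacity until a safer spot pool appears.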

Self-managed Kubernetes

Communication between Kubernetes and Spot is done through the Spot Controller, which is a pod that resides within your Kubernetes cluster. The controller is responsible for collecting metrics and events. The events are pushed via a one-way secure link to Ocean by Spot for business logic and capacity scale up/down activities.

Ocean by Spot is responsible for aggregating the metrics from the Spot Controller and building the cluster topology. Using the aggregated metrics, the SaaS component applies other business logic algorithms, such as spot instance availability prediction and instance size/type recommendations, to increase performance and optimize costs via workload density and instance pricing models (across on-demand, reserved, and spot instances). This is all done autonomously without the administrator having to worry about sizing and scaling.

There are two key ways in which Elastigroup helps create the most efficient container scaling possible:

  • Tetris Scaling – To cut down on inefficiency in Kubernetes environments, Elastigroup analyzes event messages when pods fail to start (such as insufficient memory or insufficient CPU). With these messages analyzed, Elastigroup launches an additional instance of the required size and type. This means that scaling is optimized to be as efficient as possible.
  • Smart Scaling Down – Elastigroup automatically detects and scales down idle instances, where less than 40% utilization (in terms of both memory and CPU) has been recorded for a specified number of consecutive periods. When an idle instance is detected, Elastigroup locates enough spare capacity in the other instances in the cluster, drains the instance’s pods, reschedules them on other instances, and terminates the idle instance. This means that Ticketmaster’s Kubernetes workloads were constantly and automatically self-optimizing with the help of Elastigroup.
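The two behaviors above can be illustrated with a minimal Python sketch. It is written under stated assumptions and is not Spot's implementation: the `Pod` and `Node` types, the `pick_instance_type` and `plan_drain` functions, and the instance catalog format are hypothetical, and the idle check looks at a single snapshot rather than a number of consecutive periods.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Pod:
    name: str
    cpu: float      # requested vCPUs
    memory: float   # requested GiB


@dataclass
class Node:
    name: str
    cpu: float      # total vCPUs
    memory: float   # total GiB
    pods: List[Pod] = field(default_factory=list)

    def used(self) -> Tuple[float, float]:
        return sum(p.cpu for p in self.pods), sum(p.memory for p in self.pods)

    def free(self) -> Tuple[float, float]:
        used_cpu, used_mem = self.used()
        return self.cpu - used_cpu, self.memory - used_mem

    def is_idle(self, threshold: float = 0.4) -> bool:
        """Idle means BOTH CPU and memory utilization are below the threshold."""
        used_cpu, used_mem = self.used()
        return used_cpu / self.cpu < threshold and used_mem / self.memory < threshold


def pick_instance_type(pod: Pod,
                       catalog: Dict[str, Tuple[float, float, float]]) -> Optional[str]:
    """Tetris-style scale-up: choose the cheapest type that fits the pending pod.
    catalog maps a type name to (vCPUs, GiB, illustrative hourly price)."""
    fitting = [(price, name) for name, (cpu, mem, price) in catalog.items()
               if cpu >= pod.cpu and mem >= pod.memory]
    return min(fitting)[1] if fitting else None


def plan_drain(idle: Node, others: List[Node]) -> Optional[Dict[str, str]]:
    """Scale-down: map each pod on the idle node to another node with room for it.
    Returns None if any pod cannot be placed, so the node must be kept."""
    free = {n.name: list(n.free()) for n in others}
    plan: Dict[str, str] = {}
    # Place the largest pods first so small pods do not block them.
    for pod in sorted(idle.pods, key=lambda p: (p.cpu, p.memory), reverse=True):
        target = next((name for name, (cpu, mem) in free.items()
                       if cpu >= pod.cpu and mem >= pod.memory), None)
        if target is None:
            return None
        free[target][0] -= pod.cpu
        free[target][1] -= pod.memory
        plan[pod.name] = target
    return plan
```

A node would only be terminated when `is_idle()` holds and `plan_drain` returns a complete placement; otherwise the instance stays in the cluster.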

The Results

Besides helping Ticketmaster save 66% on their cloud computing costs, automation was also an important benefit for them because managing containers no longer consumes large amounts of the team’s time. Orchestrators such as Kubernetes make it easy to place containerized applications onto instances. However, provisioning, managing, and scaling the clusters underneath the containerized application often leads to time- or cost-intensive developer environments.

With orchestrators managing the scheduling and placement, and Elastigroup managing the underlying cluster of instances, the Ticketmaster developer teams could dedicate their time and energy to what really mattered to them – improving the Ticketmaster applications to keep enhancing the user experience – without having to worry about costs and infrastructure placement.

Ticketmaster is one of the largest household names when it comes to online ticket sales. Completing hundreds of thousands of sales every day, the company is dedicated to adopting and incorporating the best and brightest technologies to help it sustain growth and innovation. To this end, Ticketmaster is constantly bringing on board forward-thinking companies, such as Front Gate Tickets, and software technologies to improve their service.