The challenge – scaling the monitoring system cost-effectively

To make sure that the Capriza platform runs smoothly from end to end, Capriza’s DevOps and Automation teams built a monitoring system that runs many test suites and monitors the network connectivity between its cloud platform and its customers’ on-premises internal networks and applications. All of these tests run as a parallelized configuration of Jenkins jobs.

These tests send a large volume of metrics and logs to Capriza’s centralized logging and monitoring systems, which in turn provide valuable insight into the platform’s SLA and performance levels and trigger alerts when there is an issue with any of its platforms or customer applications.

During the development of this monitoring system, the DevOps team noticed that the number of tests the system was running kept growing, and that the static Jenkins agent instances they had could not absorb the growth while maintaining the consistent, high-performing monitoring system they needed.

As a first step, the team started provisioning extra Jenkins agent instances from a major bare-metal server provider, which was the most cost-efficient source of extra computing capacity at that point. In retrospect, the drawbacks were that this approach was not easily scalable and was at times not as stable as needed.

Capriza therefore started migrating the Jenkins agents to the AWS cloud, knowingly accepting the higher costs that came with it.

Why Spot

After migrating the Jenkins agent instances to AWS, the Capriza team learned about Elastigroup by Spot and its integration with Jenkins. It was a perfect match: it reduced infrastructure costs while providing a scalable solution that could support the increasing workloads with great stability.

The Spot Jenkins plugin helps Capriza do more with Jenkins, because the team can configure Jenkins to automatically scale designated Amazon EC2 spot instances as agent instances depending on the number of jobs in the Jenkins queue. Capriza also reviewed the official AWS EC2 Jenkins plugin but found that it could not deliver the same scaling and cost-reduction benefits as the Spot plugin.
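The idea of sizing the agent pool from the job queue can be illustrated with a minimal Python sketch. This is not the Spot plugin's actual algorithm, and it does not call any Jenkins or Spot API; the function name, the parameters, and the default limits are all hypothetical and chosen only for illustration.

```python
import math


def desired_agents(queued_jobs, busy_executors, executors_per_agent=2,
                   min_agents=0, max_agents=20):
    """Return how many agent instances should be running.

    Hypothetical sizing rule (an assumption, not the plugin's logic):
    provide one executor for every queued or currently running job,
    then clamp the result to the configured pool limits.
    """
    if executors_per_agent <= 0:
        raise ValueError("executors_per_agent must be positive")
    if queued_jobs < 0 or busy_executors < 0:
        raise ValueError("job counts must be non-negative")

    # Total executors needed to run everything without waiting.
    needed_executors = queued_jobs + busy_executors

    # Round up: a partially used agent is still a whole instance.
    needed_agents = math.ceil(needed_executors / executors_per_agent)

    # Never go below the minimum or above the maximum pool size.
    return max(min_agents, min(max_agents, needed_agents))
```

Under these assumptions, 5 queued jobs plus 3 running jobs with two executors per agent give `desired_agents(5, 3)` a result of 4, and an empty queue with nothing running scales the pool back down to `min_agents`.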

From the point of view of Tomer Liberman, Sr. DevOps Engineer at Capriza: “The integration was straightforward and with the accompanying help of the Spot team we saw quick results regarding the system stability, scalability, and its costs!”

For more information, head over to the Capriza engineering blog post “Scaling Up Automated Testing Infrastructure While Scaling Down Costs”.

Shay Peretz, DevOps Manager at Capriza, said: “The most amazing thing about using Elastigroup by Spot is that we were able to increase the workload by 50% and reduce overall costs by 50% in only a few months by moving all of our monitoring and testing workloads to Spot.”

Spot’s Jenkins integration has allowed Capriza’s DevOps and Automation teams to monitor customers more frequently and shorten feedback loops, which speeds up issue identification and resolution and, as a result, supports higher SLAs.

Nadav Kosovsky, Automation team leader at Capriza: “After moving the workload into AWS using the Spot Jenkins plugin I can now fully focus on developing automation instead of investigating issues originating in faulty infrastructure (solved by moving back into AWS) or load balancing the Jenkins jobs due to the shortage of compute capacity. The Jenkins plugin does all of that for me, scaling up and down more agent instances when there is higher demand and a longer jobs queue.”

Shay Peretz, DevOps Manager at Capriza: “Ultimately Spot has enabled us to significantly increase the speed of our monitoring system. We can run and monitor our customers’ applications at faster rates, allowing us to provide a better service to our customers at significantly lower cost, leading to higher profitability and customer satisfaction.”

Capriza has developed a unique platform that simplifies and extends internal corporate applications such as SAP, PeopleSoft, Oracle, or custom legacy applications, and creates a mobile application in a fraction of the time required by traditional approaches. It reduces the overhead of developing a tailor-made native mobile application and, of course, the large cost of such a project.

Capriza has hundreds of enterprise customers and more than 1M end users. Capriza’s platform is highly sophisticated and uses a distributed architecture to deliver its solutions for its customers.

https://www.capriza.com/