The Challenge: Weather2020 extracts weather data from public agencies around the globe in a variety of formats, including industry-specific formats that are not suited to big data. They work with time series spanning over 40 years of meteorological and geospatial data. They needed pipelines to pull this data, clean it, enrich it, aggregate […]

Lingk.io is a data loading, data pipelines, and integration platform built on top of Apache Spark, serving commercial customers, with expertise in the education sector. Their visual interface makes it easy to load, deduplicate and enrich data from dozens of sources, and promote projects from development to production in a few clicks. They were looking […]

The Background: The mission of the United Nations Global Platform. Under the governance of the UN Committee of Experts on Big Data and Data Science for Official Statistics (UN-CEBD), the Global Platform has built a cloud-service ecosystem to support international collaboration in the development of Official Statistics using new data sources, including big data, and […]

The Challenge: Running Big Data workloads is common in the Account-based Marketing industry and requires substantial computing resources. Demandbase utilizes hundreds of resource-intensive instances to process hundreds of terabytes of data. As Demandbase became more successful, their user base grew substantially and their infrastructure had to scale accordingly. At this point, costs […]

Spotad: Keen on Spot Instances, but Environments Don’t Seem Immediately Compatible. Like any forward-thinking business, Spotad always kept one eye on their rising EC2 costs. Naturally, as the company went from success to success, these costs began to increase and Tal Maizels, CTO at Spotad, was determined to find a way to manage these rising […]

The Challenge: A Petabyte customer needed to reduce operational costs quickly. A key focus of the optimization was IT expenditure, the largest share of which went to hosting their datacenter. Their target was to reduce infrastructure costs by at least 45%. For the customer, Petabyte had run an SAP/ECC FMS […]