Full session (30 minutes)
big data

We will present our journey from batch-based to real-time analytics. We implemented a lambda-architecture data pipeline that uses Spark Streaming for real-time analytics. We will introduce the main components of this architecture and give an overview of its use cases. The first use case is building and running real-time predictive analytics with contextual multi-armed bandit models for UI optimisation. The second use case is CTR (click-through rate) estimation from real-time data using weighted linear regression models.

By the end of the session, you will know how to use Spark to combine real-time and batch analytics, and you will be more familiar with Spark's capabilities.

Yulia Stolin