Reversim Summit 2019

Full session (30 minutes)

Engineering

big data

For many organizations, data is where the money is, but storing all data forever is often too expensive and infeasible.

Luckily, today it is easy to build data lakes and data warehouses that are cheap to set up and maintain. Efficient querying of cold data, an extended SQL language with features like geospatial queries, joins between different data sources (SQL to join data from HDFS, Elasticsearch and Kafka, anyone?), and the ability to run on containers and cheap servers - all are features you can and should expect from your data-warehouse technology.

In this talk we will present Presto, a distributed SQL query engine for big data. We will discuss data architectures, Presto's features and why it is so good for your data, and then the challenges of ETL and of maintaining it over the long run.

Building a modern, cheap and open-source data-warehouse

Itamar Syn-Hershko