Full session (30 minutes)
Engineering
Day 2 | 11:10-11:40 | A3
Early detection of abnormal events can be critical for many business applications, but implementing real-time anomaly models at scale poses numerous challenges. Moreover, most analytical models have traditionally been designed for the batch processing paradigm and cannot easily be adapted to unbounded datasets and real-time latencies.
At PayPal, we've built a generic framework for developing robust and scalable streaming anomaly detection applications, with an emphasis on flexibility. Inspired by the design of scikit-learn and Spark MLlib, the framework exposes a simple pipeline-based API on top of Spark Structured Streaming that captures common patterns of the anomaly detection domain.
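To give a rough feel for what a pipeline-based anomaly detection API can look like, here is a minimal sketch in plain Python. It is not the framework's actual API: the names `Pipeline`, `DropMissing`, `RunningZScore` and the `process` method are hypothetical, and the sketch runs on in-memory micro-batches rather than on Spark Structured Streaming. The detector is a simple running z-score (Welford's online mean and variance), chosen only because it keeps state across batches, which is the property a streaming detector needs.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Protocol

Row = Dict[str, object]


class Stage(Protocol):
    """Hypothetical stage contract: take a micro-batch, return a micro-batch."""

    def process(self, batch: List[Row]) -> List[Row]: ...


class DropMissing:
    """Stateless stage: discard rows that have no usable 'value'."""

    def process(self, batch: List[Row]) -> List[Row]:
        return [row for row in batch if row.get("value") is not None]


@dataclass
class RunningZScore:
    """Stateful stage: score each value against the running mean and std."""

    threshold: float = 3.0
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def process(self, batch: List[Row]) -> List[Row]:
        scored = []
        for row in batch:
            x = float(row["value"])
            # Score against the statistics seen *before* this point.
            std = sqrt(self.m2 / (self.count - 1)) if self.count > 1 else 0.0
            score = abs(x - self.mean) / std if std > 0 else 0.0
            scored.append({**row, "score": score,
                           "is_anomaly": score > self.threshold})
            # Welford's online update of mean and squared deviations.
            self.count += 1
            delta = x - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (x - self.mean)
        return scored


class Pipeline:
    """Chain stages so each micro-batch flows through them in order."""

    def __init__(self, stages: List[Stage]):
        self.stages = list(stages)

    def process(self, batch: List[Row]) -> List[Row]:
        for stage in self.stages:
            batch = stage.process(batch)
        return batch


# Usage: state carries over from one micro-batch to the next.
pipeline = Pipeline([DropMissing(), RunningZScore(threshold=3.0)])
pipeline.process([{"value": v} for v in [10.0, 11.0, 9.0, 10.0, None, 10.5]])
result = pipeline.process([{"value": 10.2}, {"value": 50.0}])
flags = [row["is_anomaly"] for row in result]
```

The point of the pattern is that preprocessing and detection stages share one small contract, so they can be swapped or recombined without touching the streaming plumbing. In a real streaming job the detector state would have to be managed by the engine rather than kept in a Python object.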