Data streaming has become a key part of technology, often occupying multiple niches. But what differentiates data streaming platforms from comparatively simple libraries?
Focusing on time windowing and the notion of time, Scala Tech Lead Nadav Wiener helps us discover the concepts needed for enabling event-time processing!
About Time: Event-time Stream Processing with Akka Streams
Starting from versatile reactive programming libraries like Akka Streams, and growing up to high-footprint streaming data platforms such as Spark and Flink, data streaming has become an important idiom occupying multiple niches. What differentiates data streaming platforms from comparatively simple libraries? This talk focuses on one specific aspect: time windowing, and the notion of time.
Riskified aggregates large amounts of behavioral data. The kind of stream processing we perform relies on time windowing: incoming event data is partitioned into bounded time frames.
Time windowing, and the notion of time used, are a key difference between platforms and libraries. Platforms offer a robust approach to time windowing based on event-provided timestamps, yielding deterministic results regardless of event arrival time or order; streaming libraries generally just check the local clock.
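To make the distinction concrete, here is a minimal sketch in plain Scala (no Akka dependency) of assigning events to fixed-size windows by their own timestamps. It is not the implementation presented in the talk. The `Event` type, its fields, and the `EventTimeWindows` helper are hypothetical names chosen for illustration, and the sketch works on a finished batch rather than a live stream, so it ignores late data entirely.

```scala
// Hypothetical event type: the name and fields are illustrative assumptions.
final case class Event(userId: String, timestampMillis: Long)

object EventTimeWindows {
  // Start of the fixed-size (tumbling) window containing the given timestamp.
  // floorMod keeps the result correct for negative timestamps too.
  def windowStart(timestampMillis: Long, windowSizeMillis: Long): Long =
    timestampMillis - java.lang.Math.floorMod(timestampMillis, windowSizeMillis)

  // Count events per window, keyed by the timestamp carried in each event
  // rather than by the local clock at the moment the event is seen.
  def countPerWindow(events: Seq[Event], windowSizeMillis: Long): Map[Long, Int] =
    events
      .groupBy(e => windowStart(e.timestampMillis, windowSizeMillis))
      .map { case (start, grouped) => start -> grouped.size }

  // Three sample events, once in timestamp order and once reversed,
  // standing in for in-order and out-of-order arrival.
  val inOrder: Seq[Event] =
    Seq(Event("a", 1000L), Event("b", 4000L), Event("a", 11000L))
  val outOfOrder: Seq[Event] = inOrder.reverse
}
```

Because the window is derived only from `timestampMillis`, `countPerWindow(inOrder, 10000L)` and `countPerWindow(outOfOrder, 10000L)` return the same map. A version keyed on the local clock would instead depend on when each event happened to arrive.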
Our solution brings the notion of event-time to Akka Streams. During the talk I will describe the concepts needed for enabling event-time processing, along with simple building blocks for implementation.
This talk was given by Nadav Wiener at Scalapeño 2018.