Processing Fast Data with Apache Spark: The Tale of Two Streaming APIs by Gerard Maas

Fast Data with Apache Spark: we have lots of questions!

At Reactive Summit, Principal Engineer Gerard Maas gave us the tale of two streaming APIs. From their virtues and capabilities to key differences, the tale tells all.


Fast Data architectures answer the enterprise's growing need to process and analyze continuous streams of data, which accelerates decision making and enables faster responses to changing market conditions. Apache Spark is a popular framework for data analytics. Its streaming capabilities are exposed through two APIs: the low-level Spark Streaming and the more declarative Structured Streaming, which builds on the recent advances in Spark SQL query optimization and code generation.
 
 After a quick introduction to both APIs, we will discuss their virtues, capabilities and key differences:
 - How to get started: ease of development
 - How to deal with time: both processing time and event time
 - How to deal with state: local state, distributed state, and its relation to time
 - How to migrate: functional coding strategies
 - How to integrate: Fast Data and microservices
 
 Using a practical approach supported by live demonstrations, we will provide insights into the sweet spot of each API and guidance on how to choose one, or even combine both, to implement functional and resilient streaming pipelines.
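
 As a small illustration of the "how to deal with time" topic listed above, the following is a minimal sketch in plain Python (it is not Spark code and does not use either Spark API). It counts the same events in 10-second tumbling windows twice: once by the time each event occurred, and once by the time it arrived at the processor. The `Event` type, the `count_per_window` helper, and the sample timestamps are all hypothetical and exist only for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_time: int    # seconds: when the event happened at the source
    arrival_time: int  # seconds: when the processor received it


def window_start(timestamp: int, size: int) -> int:
    """Return the start of the tumbling window that contains `timestamp`."""
    return timestamp - (timestamp % size)


def count_per_window(events, size, use_event_time):
    """Count events per tumbling window, keyed by the window start."""
    counts = Counter()
    for e in events:
        ts = e.event_time if use_event_time else e.arrival_time
        counts[window_start(ts, size)] += 1
    return dict(counts)


# Hypothetical sample data: the last event happened at t=9 but arrived at t=21.
events = [Event(1, 2), Event(4, 5), Event(8, 12), Event(11, 13), Event(9, 21)]

by_event_time = count_per_window(events, 10, use_event_time=True)
by_processing_time = count_per_window(events, 10, use_event_time=False)

print(by_event_time)       # {0: 4, 10: 1}
print(by_processing_time)  # {0: 2, 10: 2, 20: 1}
```

 The late event lands in the first window when grouped by event time, but in a later window when grouped by arrival time, which is why the choice of time semantics changes the result of a streaming aggregation.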


This talk was given by Gerard Maas at Reactive Summit.