Kubernetes as a Streaming Data Platform with Kafka, Spark, and Scala
Kubernetes has become the de facto platform for running containerized workloads on a cluster. Its native abstractions let us deploy and manage individual distributed applications and services.
In contrast, streaming applications require the coordinated deployment of different components such as topics, storage, and application runtimes that need to work in unison.
In this talk, we discuss how Kubernetes Operators can manage this complexity. In particular, we will see:
- How to use the Kafka Operator to manage topics
- How to use the Spark Operator to deploy and manage Spark Structured Streaming applications
- How a custom operator can harness Kubernetes resources and other operators to implement the operational requirements of our applications
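To give a flavor of the last point, here is a minimal sketch of the reconcile step at the heart of an operator: compare the resources a pipeline declares with the ones observed on the cluster, and derive the actions that close the gap. Everything in it (`Resource`, `Topic`, `SparkApp`, `Action`, `Reconciler`) is a hypothetical model invented for illustration. It is not the API of the Kafka Operator, the Spark Operator, or any Kubernetes client library, and a real operator would delegate each action to those components rather than just listing it.

```scala
// Hypothetical model for illustration only; none of these types come from a real operator.
sealed trait Resource {
  def name: String
  def kind: String
  // Kind plus name identifies a resource, so a topic and an app may share a name.
  def key: String = s"$kind/$name"
}
final case class Topic(name: String, partitions: Int) extends Resource { val kind = "topic" }
final case class SparkApp(name: String, image: String) extends Resource { val kind = "spark-app" }

// The actions a reconcile pass can request.
sealed trait Action
final case class Create(resource: Resource) extends Action
final case class Update(resource: Resource) extends Action
final case class Delete(resource: Resource) extends Action

object Reconciler {
  // Returns creates and updates in the order the desired resources are declared,
  // followed by deletes for observed resources that are no longer declared.
  def reconcile(desired: Seq[Resource], observed: Seq[Resource]): Seq[Action] = {
    val observedByKey = observed.map(r => r.key -> r).toMap
    val desiredKeys = desired.map(_.key).toSet

    val upserts: Seq[Action] = desired.flatMap { want =>
      observedByKey.get(want.key) match {
        case None                       => List(Create(want)) // missing on the cluster
        case Some(have) if have != want => List(Update(want)) // present but drifted
        case Some(_)                    => Nil                // already in the desired state
      }
    }
    val deletes: Seq[Action] =
      observed.filterNot(r => desiredKeys.contains(r.key)).map(r => Delete(r))

    upserts ++ deletes
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // A pipeline declares its topic before the application that reads from it.
    val desired = List(Topic("events", 6), SparkApp("enrich", "example/enrich:1.0"))
    val observed = List(Topic("events", 3), Topic("stale", 1))
    Reconciler.reconcile(desired, observed).foreach(println)
  }
}
```

Keeping the reconcile step a pure function of desired and observed state makes it easy to test without a cluster. Declaring topics ahead of the applications that use them is one simple way to express the coordination between components, under the assumption that actions are applied in order.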
We will share our experience building an operator for streaming data pipelines using Scala and show how this all works in real life.