Once you've learned more about Spark internals in Part 1: Spark Internals & Streaming of Workshop Track 2: Spark Streaming and Internals, the instructor, Ronak Nathani, will lead you through practical, hands-on Spark Streaming examples and through building a microservices architecture that uses Kafka as a distributed message bus.
In this workshop, you will learn:
- Introduction to streaming technologies - Spark Streaming and Kafka Streams
  - Real-time vs. micro-batching
  - Processing guarantees
  - Hands-on Spark Streaming example (see the sketch after this list)
- Fundamentals of Kafka (illustrated by the client sketch after this list)
  - Brokers
  - Topics and offsets
  - Partitions and parallelism
  - Producers
  - Consumers and consumer groups
- Building a microservices architecture using Kafka as a distributed message bus
  - How Kafka can serve as the backbone of a microservices architecture
- Kafka Connect
  - Why Kafka Connect and what is it?
  - The Kafka Connect model
  - Hands-on examples (see the sample connector config after this list)
- Kafka Streams
  - Why Kafka Streams and what is it?
  - Parallelism and fault tolerance
  - State and memory management
  - Scaling up/down
  - Data model
  - Hands-on examples (see the topology sketch after this list)
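
To give a flavor of the hands-on Spark Streaming example, here is a minimal DStream word count in Scala. It is only a sketch: the socket source, host/port, and 5-second batch interval are illustrative assumptions, not the workshop's actual exercise.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingWordCount {
  def main(args: Array[String]): Unit = {
    // Two local threads: one for the socket receiver, one for processing
    val conf = new SparkConf().setAppName("StreamingWordCount").setMaster("local[2]")

    // 5-second batch interval: each batch is executed as a small Spark job
    val ssc = new StreamingContext(conf, Seconds(5))

    // Illustrative source: lines typed into `nc -lk 9999` on the same machine
    val lines = ssc.socketTextStream("localhost", 9999)

    val counts = lines
      .flatMap(_.split("\\s+"))   // split each line into words
      .map(word => (word, 1))     // pair each word with a count of 1
      .reduceByKey(_ + _)         // sum counts per word within the batch

    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The batch interval is what the real-time vs. micro-batching discussion is about: Spark Streaming processes the stream as a sequence of small batch jobs rather than one record at a time.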
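
For the Kafka fundamentals, a small producer/consumer sketch in Scala using the plain Java Kafka clients (2.x client API assumed) makes the broker, topic, partition, offset, and consumer-group vocabulary concrete; the broker address, topic name, group id, and key are placeholders.

```scala
import java.time.Duration
import java.util.{Collections, Properties}

import scala.collection.JavaConverters._

import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object KafkaBasics {
  def main(args: Array[String]): Unit = {
    val brokers = "localhost:9092" // placeholder broker address

    // Producer: each record lands in one partition of the topic, chosen by the key's hash
    val producerProps = new Properties()
    producerProps.put("bootstrap.servers", brokers)
    producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    val producer = new KafkaProducer[String, String](producerProps)
    producer.send(new ProducerRecord[String, String]("events", "user-42", "hello"))
    producer.close() // flushes any buffered records

    // Consumer: joins the consumer group "example-group"; the topic's partitions
    // are divided among all consumers that share this group id
    val consumerProps = new Properties()
    consumerProps.put("bootstrap.servers", brokers)
    consumerProps.put("group.id", "example-group")
    consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    consumerProps.put("auto.offset.reset", "earliest")
    val consumer = new KafkaConsumer[String, String](consumerProps)
    consumer.subscribe(Collections.singletonList("events"))

    // Poll once and print where each record came from: partition and offset
    val records = consumer.poll(Duration.ofSeconds(5))
    for (r <- records.asScala)
      println(s"partition=${r.partition} offset=${r.offset} key=${r.key} value=${r.value}")
    consumer.close()
  }
}
```

Because the consumer joins "example-group", starting more consumers with the same group id spreads the topic's partitions across them, which is how consumer groups provide parallel consumption.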
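
As a taste of the Kafka Connect model, a connector is typically configured rather than coded. The snippet below is a minimal standalone file-source configuration using Kafka's built-in FileStreamSource connector; the connector name, file path, and topic are illustrative.

```properties
# Illustrative standalone source connector: tail a local file into a Kafka topic
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=/tmp/app.log
topic=app-logs
```

A Connect worker reads this configuration, instantiates the connector, and splits the work into tasks; sink connectors are configured the same way to move data out of Kafka.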
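
Finally, as a preview of the Kafka Streams hands-on examples, here is a word-count topology written with the kafka-streams-scala DSL (Kafka 2.x assumed); the application id, broker address, and topic names are placeholders.

```scala
import java.util.Properties

import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.{KafkaStreams, StreamsConfig}

object StreamsWordCount {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-example") // placeholder application id
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092") // placeholder broker address

    val builder = new StreamsBuilder()

    // Read lines, split into words, re-key by word, and keep a running count
    // in a local state store backed by a changelog topic
    builder.stream[String, String]("text-input")
      .flatMapValues(_.toLowerCase.split("\\W+").toList)
      .groupBy((_, word) => word)
      .count()
      .toStream
      .to("word-counts")

    val streams = new KafkaStreams(builder.build(), props)
    streams.start()
    sys.addShutdownHook(streams.close())
  }
}
```

The count() step keeps its state in a local store that Kafka Streams backs with a changelog topic; running more instances with the same application id redistributes the input partitions and their state, which is the basis of the fault tolerance and scaling up/down discussion.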
Level:
Intermediate - Advanced
Prerequisites:
- A machine with a Unix-based OS, or a virtual environment running one
- Familiarity with the command line
- Some experience programming in Scala
- Note: all hands-on examples and projects will run on distributed Kafka clusters on AWS; the environment will be pre-configured for everyone
Meet Your Instructor:

The workshop is led by Ronak Nathani, Senior Data Engineer at Insight Data Science. He currently develops and deploys microservices using Kubernetes on AWS for Insight’s internal tools and services, builds and advances technical content for the Data Engineering Fellows Program, and develops tools to support the Data Engineering team and Fellows. He joined Insight as a Data Engineering Fellow and then co-led the Data Engineering Fellows program for a year.


