ABOUT THE TALK

Measuring and aggregating network request duration percentiles and error rates is essential for Uber teams to monitor the reliability of network libraries and catch performance issues at a global scale, across 600+ cities and 4,500+ mobile carriers. In this talk, we will discuss how we used the Hudi Delta Streamer to build Spark pipelines that incrementally pull from data sources and generate network metrics, powering near real-time network dashboards. Hudi is an open-source data format that manages the storage of large analytical datasets and provides efficient upsert and incremental processing primitives.

We will describe the design for incrementally updating different types of network metrics using Hudi upserts, and how it reduces ingestion and processing latency by taking advantage of incremental pulls. We will also share our experience launching and running near real-time network analytics pipelines at Uber.
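
To make the idea of incrementally updated metrics concrete, here is a minimal sketch in plain Python. It is not the pipeline from the talk and does not use any Hudi or Spark API. It only illustrates why error rates and duration percentiles can be kept up to date by folding in new records: counts and histogram buckets are mergeable, so each batch updates the existing row for a key instead of triggering a full recomputation. The bucket boundaries, the `Metric` class, the `upsert_batch` helper, and the (city, carrier) key are all hypothetical names chosen for illustration.

```python
from bisect import bisect_left
from dataclasses import dataclass, field

# Hypothetical latency bucket upper bounds in milliseconds (an assumption).
BUCKETS_MS = [50, 100, 200, 500, 1000, 5000]


@dataclass
class Metric:
    """Mergeable aggregate for one key, e.g. a (city, carrier) pair."""
    requests: int = 0
    errors: int = 0
    # One counter per bucket, plus a final overflow bucket.
    histogram: list = field(
        default_factory=lambda: [0] * (len(BUCKETS_MS) + 1)
    )

    def add(self, duration_ms, is_error):
        self.requests += 1
        self.errors += int(is_error)
        # bisect_left picks the first bucket whose bound is >= duration_ms.
        self.histogram[bisect_left(BUCKETS_MS, duration_ms)] += 1

    def error_rate(self):
        return self.errors / self.requests if self.requests else 0.0

    def percentile_upper_bound(self, p):
        """Upper bound of the bucket that contains the p-th percentile."""
        if not self.requests:
            return None
        target = p / 100 * self.requests
        seen = 0
        for bound, count in zip(BUCKETS_MS + [float("inf")], self.histogram):
            seen += count
            if seen >= target:
                return bound
        return float("inf")


def upsert_batch(table, batch):
    """Fold only the newly arrived records into the keyed table."""
    for key, duration_ms, is_error in batch:
        table.setdefault(key, Metric()).add(duration_ms, is_error)
    return table


# Two incremental batches update the same key in place.
table = {}
key = ("san_francisco", "carrier_a")  # hypothetical key
upsert_batch(table, [(key, 40, False), (key, 120, True)])
upsert_batch(table, [(key, 80, False), (key, 900, False)])
```

The percentile here is a bucket-level approximation, which is the usual trade-off for making the aggregate mergeable across batches.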

Nishith Agarwal

Senior Software Engineer | Uber

Nishith Agarwal is a software engineer at Uber. He is an Apache Hudi PPMC member and leads Hudi efforts at Uber. His interests lie in large-scale distributed systems. Nishith was one of the initial engineers on Uber's data team and helped scale Uber's data platform to over 100 petabytes while reducing data latency from hours to minutes.

