Technical Talks

The deconstructed database at Datadog

Julien Le Dem | Principal Engineer | Datadog
Pierre Lacave | Staff Engineer | Datadog

Datadog has grown from a startup focused on infrastructure monitoring into a platform processing over a hundred trillion events daily.

Over the years, we expanded beyond metrics to include traces, logs, profiling, real user monitoring, and security. Our user base has broadened from operations to developers, analysts, and business users. More recently, automated agents have also become key consumers of our platform.

Our bottom-up culture encourages teams to take initiative. Consequently, we developed multiple specialized ingestion pipelines and query engines. These were built to meet strict real-time requirements and deliver an interactive experience, providing users with the insights necessary for success.

This focus on efficiency led to custom-built, proprietary solutions designed for our unique constraints. Today, however, the evolving landscape allows us to reconcile these specialized engines with open standards, blending the versatility of the ecosystem with our purpose-built designs.

In recent years, we have refactored the interfaces of these query engines to create a composable data system. This allows us to better leverage shared capabilities, enabling cross-dataset querying, advanced analytics, and more versatile access patterns.

Our goal is to scale our bottom-up culture. By defining clear contracts and high-level components, we enable decentralized decision-making. This improves performance, efficiency, and flexibility across the platform, while reducing silos.

By adopting a deconstructed stack, we combine the efficiency of the open-source ecosystem with our internal capabilities to build a truly composable system. This architecture provides the flexibility to adapt to immediate and future demands for scale, velocity, and operational resilience. It also prepares us for growing challenges such as data-intensive AI workloads.

In this talk, we will discuss how we rely on and contribute to key projects in the data ecosystem: Arrow for data interchange, Substrait for plans, Calcite as an optimizer, DataFusion as an execution core, and Parquet for columnar storage.

Julien Le Dem
Principal Engineer | Datadog

Julien Le Dem is a Principal Engineer at Datadog, serves as an officer of the ASF, and is a member of the LF AI & Data Technical Advisory Council. He co-created the Parquet, Arrow, and OpenLineage open-source projects and is involved in several others. His leadership career began in data platforms at Yahoo!, where he received his Hadoop initiation, and continued at Twitter, Dremio, and WeWork. He then co-founded Datakin (acquired by Astronomer) to solve data observability. His French accent makes his talks particularly attractive.

Pierre Lacave
Staff Engineer | Datadog

Pierre Lacave is a Staff Engineer at Datadog with over 15 years of experience building and operating large-scale data systems. His career spans high-stakes domains, including FinTech, AdTech, and Observability, where he specialized in developing and managing the performance and reliability of massive, real-time analytics systems.

Pierre is a member of the Apache DataSketches PMC, contributing to the development of probabilistic algorithms for big data analysis. Currently, he focuses on Datadog’s query infrastructure, committed to building open, scalable, and resilient distributed systems.