Technical Talks

Benn Stancil
Founder | Mode
Jason Ganz
Director, DX + AI | dbt Labs

Benchmarking AI Agents Against Realistic Analytical Tasks with ADE-bench

  • Coding Agents & Autonomous Dev

There are many benchmarks that attempt to measure how well LLMs and AI agents can write SQL queries or do complicated statistical analysis. But as most practitioners know, this is only a small part of our job. Before we can write a query, we have to figure out the business context behind the question. We have to figure out which tables to use in a messy database. We have to make subjective decisions about vaguely defined problems. All of this makes benchmarking analytical agents difficult.

We built a new benchmark—ADE-bench—that aspires to do exactly that. It gives agents complex analytical environments to work in and ambiguous tasks to solve, and measures how well they perform.

In this talk, we'll share how we built the benchmark, the results of our tests, a bunch of things we learned along the way, and what we think is coming next. The benchmark harness is open source, and can be found here: https://github.com/dbt-labs/ade-bench

Benn Stancil

Founder

Mode

Benn Stancil is a cofounder of Mode, an analytics and BI company that was bought by ThoughtSpot in 2023. While at Mode, Benn held roles leading Mode’s data, product, marketing, and executive teams; at ThoughtSpot, he was the Field CTO. More recently, Benn worked on the analytics team on the Harris for President campaign. He regularly writes about data and technology at benn.substack.com. 

Jason Ganz

Director, DX + AI

dbt Labs

Jason Ganz used to call himself a futurist but frankly isn't certain what one can do with that word these days. He is the Director of Developer Experience and AI at dbt Labs. You can find him across the internet thinking about how to build resilience and navigate the AI transition while retaining our humanity, treating each other well and having fun.