Buy Early Bird Tickets for AI Council 2026!

MAY 12-14, 2026 | SAN FRANCISCO, CA

The AI Conference for Humans Who Ship

Meet the world's top AI infrastructure minds at AI Council, where the architects of production AI systems share what actually works. Experience THREE DAYS of high-quality technical talks and meaningful interactions with the engineers & teams building our AI-driven future.


Speakers Who Ship Code, Not Slides

AI Council features hand-selected engineers building production systems, teams shipping billion-parameter inference, and experts solving AI's hardest challenges.

Benn Stancil, Founder, Mode
Coding Agents & Autonomous Dev
Benchmarking AI Agents Against Realistic Analytical Tasks with ADE-bench
There are many benchmarks that attempt to measure how well LLMs and AI agents can write SQL queries or do complicated statistical analysis. But as most practitioners know, this is only a small part of our job. Before we can write a query, we have to figure out the business context behind the question. We have to figure out which tables to use in a messy database. We have to make subjective decisions about vaguely defined problems. All of this makes benchmarking analytical agents difficult.

We built a new benchmark—ADE-bench—that aspires to capture exactly that messier reality. It gives agents complex analytical environments to work in and ambiguous tasks to solve, and measures how well they perform.

In this talk, we'll share how we built the benchmark, the results of our tests, a bunch of things we learned along the way, and what we think is coming next. The benchmark harness is open source, and can be found here: https://github.com/dbt-labs/ade-bench
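For a flavor of what such a task could look like, here is a purely illustrative sketch of an analytical-agent task and a partial-credit scorer. The fields, checker, and numbers are invented for this page and are not taken from the ade-bench repo.

    # Hypothetical sketch of an analytical-agent benchmark task; field names and the
    # checker are illustrative, not the actual ade-bench schema.
    from dataclasses import dataclass, field

    @dataclass
    class AnalyticalTask:
        prompt: str                      # deliberately ambiguous business question
        warehouse_fixture: str           # path to a seeded, intentionally messy database
        expected_tables: set[str] = field(default_factory=set)   # tables a sensible analyst would touch
        rubric: dict[str, float] = field(default_factory=dict)   # partial-credit criteria

    def score(task: AnalyticalTask, tables_used: set[str], answer_summary: str) -> float:
        """Give partial credit for touching the right tables and hitting rubric keywords."""
        table_score = len(task.expected_tables & tables_used) / max(len(task.expected_tables), 1)
        rubric_score = sum(w for kw, w in task.rubric.items() if kw in answer_summary.lower())
        return 0.5 * table_score + 0.5 * min(rubric_score, 1.0)

    task = AnalyticalTask(
        prompt="Why did activation dip last quarter in EMEA?",
        warehouse_fixture="fixtures/emea.duckdb",
        expected_tables={"signups", "events", "accounts"},
        rubric={"seasonality": 0.5, "onboarding change": 0.5},
    )
    print(score(task, {"signups", "events"}, "Likely seasonality plus an onboarding change."))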
Vasundra Srinivasan, Director AI Architect Lead, Salesforce
AI Engineering
Running Enterprise Agents in Production: Architecture and Secure Execution Models
I lead large-scale AI implementations and design agents for some of the most security-sensitive enterprises in the world. The pattern is always the same: prototypes and MVPs are easy; running agents in production is not. The moment an agent touches real data and real systems, security, policy, identity, orchestration, and auditability dominate the design. You also discover how many teams must align to make it safe. Without a shared and evolving playbook, the deployment stalls. In this talk I outline the approach I use to build enterprise agents that actually run in production. I cover the core architecture, the supervisory runtime, and the secure execution model that keeps agents predictable: identity boundaries, retrieval controls, RBAC and ABAC alignment, sandboxed tools, and evaluation loops. Through this talk I'd like to show what it really takes to operate agents inside large organizations and provide a practical blueprint you can adapt to your own environment.
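The runtime itself isn't spelled out in the abstract, but the general shape of an identity- and policy-aware gate in front of agent tool calls can be sketched in a few lines. The roles, tools, and limits below are invented for illustration and are not Salesforce's actual model.

    # Illustrative sketch of an identity-aware gate in front of agent tool calls.
    # Roles, tools, and policy values are invented examples.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Principal:
        user_id: str
        roles: frozenset[str]          # RBAC roles inherited from the requesting user

    # Which roles may invoke which tools (RBAC), plus attribute checks (ABAC).
    TOOL_POLICY = {
        "read_crm_record": {"roles": {"support", "sales"}, "max_rows": 100},
        "issue_refund":    {"roles": {"support_manager"}, "max_amount": 500.0},
    }

    def authorize(principal: Principal, tool: str, **attrs) -> bool:
        """Return True only if the caller's roles and the call's attributes satisfy policy."""
        policy = TOOL_POLICY.get(tool)
        if policy is None or not (principal.roles & policy["roles"]):
            return False                                  # unknown tool or no matching role
        if tool == "issue_refund" and attrs.get("amount", 0) > policy["max_amount"]:
            return False                                  # attribute-based (ABAC) limit
        return True

    agent_caller = Principal("u-142", frozenset({"support"}))
    print(authorize(agent_caller, "read_crm_record"))            # True
    print(authorize(agent_caller, "issue_refund", amount=50.0))  # False: missing role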
Marta Paes, Senior Product Manager, ClickHouse
Lightning Talks
Data Lake CDC: are we there yet?
The idea of incremental reads from data lakes has been cooking for years, but few are serving it up. As a user, you must wrangle change feeds, snapshots, time travel, that one corrupted manifest file. Do you need to be a "Big Data Engineer" to get it right? In this lightning talk, we’ll explore what’s broken, what's just hard, and why making data lake CDC accessible is a problem worth solving.
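For one concrete flavor of what incremental reads look like today (only one of the table formats this space covers), Delta Lake's change data feed can be read roughly like this, assuming a Spark session with Delta configured and a table created with the feed enabled; the table name and versions are placeholders.

    # Reading incremental changes from a Delta table via its change data feed (CDF).
    # Assumes Spark with Delta Lake configured and a table created with
    # delta.enableChangeDataFeed = true; table name and versions are placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("cdc-demo").getOrCreate()

    changes = (
        spark.read.format("delta")
        .option("readChangeFeed", "true")     # ask for row-level changes, not a snapshot
        .option("startingVersion", 42)        # resume from the last version we processed
        .table("lake.orders")
    )

    # Each row carries _change_type (insert / update_preimage / update_postimage / delete),
    # _commit_version, and _commit_timestamp alongside the table columns.
    changes.filter("_change_type != 'update_preimage'").show()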
Zach Mueller, Head of Developer Relations, Lambda
Model Systems
Optimizing Model Training End-to-End: A Tiny MoE Case Study
Cloud compute is expensive, and wasting runs on the assumption that "just scaling will fix any problem" leaves you with less time to fix errors and less compute to train the model you want. In this talk, I will cover the easy optimizations you might miss at small scale (minimizing communication, using the most effective algorithms, ensuring you're getting the most FLOPs possible), so that when you do scale up, nothing goes to waste. I'll focus on what worked at home, and on what it took to then scale further onto the cloud.
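One cheap sanity check behind "getting the most FLOPs possible" is estimating model FLOPs utilization (MFU) from numbers you already track. The sketch below uses placeholder throughput figures and the usual 6 x params x tokens approximation; nothing here is from the talk itself.

    # Rough MFU estimate: achieved training FLOP/s vs. the hardware's peak FLOP/s.
    # Numbers are placeholders; 6 * params * tokens is the common dense-transformer
    # approximation (for an MoE, use the *active* parameter count).
    def mfu(active_params: float, tokens_per_second: float,
            num_gpus: int, peak_flops_per_gpu: float) -> float:
        achieved = 6 * active_params * tokens_per_second      # training FLOP/s (fwd + bwd)
        peak = num_gpus * peak_flops_per_gpu
        return achieved / peak

    # Example: a tiny MoE with 1.3B active params on 8 GPUs rated at ~312 TFLOP/s (bf16).
    print(f"MFU ~ {mfu(1.3e9, 180_000, 8, 312e12):.1%}")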
Leonhard Spiegelberg, Member of Technical Staff, OpenAI
Lightning Talks
The Modern Data Stack Lost the War: Stop Building More DataFrame APIs
After more than a decade of limited innovation, the modern data stack is still slow, fragmented, and painfully repetitive. Every new dataframe library promises better ergonomics or performance, yet converges on the same API with the same limitations. The failure isn't execution or scale; it's ignoring that appearance no longer matters: agents and AI-first developers couldn't care less about how an API looks today. In this talk we'll examine why the dataframe paradigm keeps reproducing itself, why it consistently underdelivers, and why the data stack's biggest problems were never going to be solved at the API layer first. Princess Leia once put her trust in Obi-Wan Kenobi - what if we did the same with simple Python functions? Could we defeat the evil empire of DataFrame APIs?
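To make the "simple Python functions" provocation concrete, here is a trivial, invented example of the alternative being gestured at: plain data in, plain function, plain data out, with no dataframe API for an agent to memorize. The data and function are mine, not the speaker's.

    # The same tiny aggregation expressed as a plain function over plain data,
    # rather than a dataframe-API chain. Example data is invented.
    from collections import defaultdict

    def revenue_by_region(orders: list[dict]) -> dict[str, float]:
        totals: dict[str, float] = defaultdict(float)
        for o in orders:
            if o["status"] == "paid":
                totals[o["region"]] += o["amount"]
        return dict(totals)

    orders = [
        {"region": "EU", "amount": 120.0, "status": "paid"},
        {"region": "US", "amount": 80.0,  "status": "refunded"},
        {"region": "EU", "amount": 40.0,  "status": "paid"},
    ]
    print(revenue_by_region(orders))   # {'EU': 160.0}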
Chris Alexiuk, Product Research Engineer, NVIDIA
Model Systems
RLVR in Practice: From Synthetic Data to GRPO
Reinforcement Learning from Verifiable Rewards (RLVR) is increasingly common in post-training pipelines, but the practical details are often glossed over. How do you design reward functions that programmatically verify model outputs? What makes synthetic training data effective? How do you build a custom RL environment that doesn't silently break your training?
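As a minimal illustration of what "programmatically verify model outputs" can mean in practice, a verifiable reward for a numeric task can be as small as the function below. The answer-tag format and partial-credit values are illustrative choices, not a prescribed recipe from the talk.

    # Minimal verifiable reward: parse the model's tagged answer and compare it
    # to ground truth, with a small bonus for following the required format.
    import re

    def rlvr_reward(completion: str, ground_truth: float) -> float:
        match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
        if match is None:
            return 0.0                     # unverifiable output earns nothing
        try:
            value = float(match.group(1).strip())
        except ValueError:
            return 0.1                     # followed the format but gave a non-numeric answer
        return 1.0 if abs(value - ground_truth) < 1e-6 else 0.1

    print(rlvr_reward("Thinking... <answer>42</answer>", 42.0))   # 1.0
    print(rlvr_reward("The answer is 42", 42.0))                  # 0.0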
Elie Bakouch, Lead Researcher, Hugging Face
Model Systems
How Open Frontier Labs Actually Train Their Models
Training a large language model is an exercise in tradeoffs you didn't expect. Should you spend a week optimizing infrastructure and architecture, or just start training? This talk covers how to think about pre-training decisions: why architecture changes are rarely about accuracy and almost always about performance, how frontier open labs actually design models, and how to make principled calls when everything is a tradeoff. We'll walk through real examples of decisions that looked obvious in retrospect, and the ones that still don't have clean answers.
Naheil McAvinue, Director, Enterprise Analytics & Data Science, GitLab
Applied AI
Breaking the Proof-of-Concept Cycle: Stop Prototyping and Get Into Production
Most organizations are experimenting with AI at work. Proofs of concept are thriving in small-scale silos, but how can you scale those wins across an enterprise? This talk discusses the symptoms that trap teams in the cycle of perpetual prototyping and provides a practical framework for breaking through common technical, organizational, and cultural barriers to scale. Learn how to plan for production from day one and turn your AI experiments into scalable, business-impacting solutions.
Lucas Atkins, CTO, Arcee
Model Systems
Trinity: Training a 400B MoE from Scratch Without Losing Your Mind
Training sparse Mixture-of-Experts models at scale is notoriously unstable. Experts collapse, routers drift, and loss spikes appear out of nowhere. This talk covers how we built Trinity Large, a 400B parameter MoE (13B active), trained on 17 trillion tokens with zero loss spikes.

We'll walk through the decisions that actually mattered: why we replaced standard aux-loss-free balancing with a momentum-based approach (SMEBU), how interleaved local/global attention made context extension surprisingly smooth, and what broke when we first tried running Muon at scale.

I'll also cover the less glamorous stuff: our Random Sequential Document Buffer to reduce batch heterogeneity, recovering from B300 GPU faults on brand-new hardware, and the six changes we shipped at once when routing started collapsing mid-run.

Practical lessons for teams training their own MoEs or scaling up sparse architectures.
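SMEBU itself is not described in this abstract, so the snippet below is only a generic sketch of the family of ideas it sits in: aux-loss-free balancing adjusts a per-expert routing bias toward balanced load, and a momentum term smooths the noisy per-batch load signal before the adjustment. Every name and constant here is invented; this is not Arcee's algorithm.

    # Generic sketch of momentum-smoothed, bias-based expert load balancing.
    # NOT Arcee's SMEBU; it only illustrates the aux-loss-free family the
    # abstract contrasts it with. All constants are made up.
    import numpy as np

    def update_router_bias(bias, load_ema, batch_load, momentum=0.9, step=1e-3):
        """bias, load_ema, batch_load: arrays of shape (num_experts,)."""
        load_ema = momentum * load_ema + (1 - momentum) * batch_load   # smooth noisy per-batch load
        target = load_ema.mean()                                       # perfectly balanced load
        # Push bias up for under-used experts and down for over-used ones (sign update).
        bias = bias - step * np.sign(load_ema - target)
        return bias, load_ema

    bias = np.zeros(8)
    load_ema = np.full(8, 1 / 8)
    batch_load = np.array([0.30, 0.05, 0.10, 0.10, 0.15, 0.10, 0.10, 0.10])
    bias, load_ema = update_router_bias(bias, load_ema, batch_load)
    print(bias)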
Vik Korrapati, CTO, M87 Labs (Moondream)
Model Systems
No Dropped Frames: Designing a VLM around a Latency Budget
Moondream is a vision language model that runs in real time on video streams. This talk covers the model-side work behind it.

I'll start with architecture: upcycling from dense to MoE, and the tradeoffs when you're optimizing for latency rather than just parameter count. Then tokenization: why we built a custom SuperBPE tokenizer and what it bought us. The goal throughout was to avoid modeling decisions that would hurt us at inference time.

I'll also cover training infrastructure. We wrote custom training engines and RL systems because existing open source projects were pushing us toward design decisions that didn't fit. I'll talk about where we diverged and what we got out of it.

Finally, inference. Real-time VLM isn't just a serving problem or a modeling problem. We built a custom inference engine alongside the model, and I'll cover how the two informed each other.
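The latency-budget framing is easy to make concrete with arithmetic: at a target frame rate the end-to-end budget per frame is fixed, and vision encoding, prefill, and decode all have to fit inside it. The numbers below are generic placeholders, not Moondream's figures.

    # Back-of-the-envelope latency budget for a real-time VLM on a video stream.
    # All numbers are illustrative placeholders, not Moondream's measurements.
    target_fps = 24
    budget_ms = 1000 / target_fps                  # ~41.7 ms end-to-end per frame

    vision_encode_ms = 12                          # image/frame encoder
    prefill_ms = 10                                # prompt + visual tokens through the LLM
    decode_tokens = 8                              # short structured answer per frame
    decode_ms_per_token = 2.2

    total_ms = vision_encode_ms + prefill_ms + decode_tokens * decode_ms_per_token
    print(f"budget {budget_ms:.1f} ms, spent {total_ms:.1f} ms, "
          f"headroom {budget_ms - total_ms:.1f} ms")   # negative headroom = dropped frames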
Nikhil Benesch, CTO, Turbopuffer
Data Eng & Databases
Fast AI Search on Object Storage @ >1 Trillion Scale

This talk goes beyond architecture diagrams to share what actually happens when you operate an agentic search engine on trillions of documents.

We'll dig into how an object storage-native design allows a small team of engineers to manage an AI search engine that scales to:

  • Peak load of 1M+ writes per second and 30k+ searches per second
  • 1+ trillion documents
  • 5+ PB of logical data
  • 400+ tenants
  • p90 query latency <100 ms
Topics include:
  • How using a modern storage architecture decreases COGS by 10x or more
  • Optimizing traditional vector and FTS indexes for the high latency of object storage
  • Building search algorithms that are fine-tuned for LLM-initiated searches
  • A simple rate-limiting technique that provides strong performance isolation in multi-tenant environments (a toy sketch follows below)
  • Observability, reliability, and performance lessons learned from production incidents.

Attendees will leave with a concrete understanding of how separating storage from compute, and treating object storage as the primary database, changes not only the cost structure but the entire operational model of large-scale AI search.
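The talk doesn't reveal the exact rate-limiting technique, so the sketch below is just the textbook per-tenant token bucket that such isolation schemes are commonly built from; rates and costs are invented.

    # Textbook per-tenant token bucket; a plausible shape for multi-tenant isolation,
    # not necessarily the technique described in the talk. Rates are invented.
    import time
    from collections import defaultdict

    class TenantRateLimiter:
        def __init__(self, rate_per_sec: float, burst: float):
            self.rate, self.burst = rate_per_sec, burst
            self.tokens = defaultdict(lambda: burst)      # start each tenant with a full bucket
            self.last = defaultdict(time.monotonic)

        def allow(self, tenant: str, cost: float = 1.0) -> bool:
            now = time.monotonic()
            elapsed = now - self.last[tenant]
            self.last[tenant] = now
            # Refill proportionally to elapsed time, capped at the burst size.
            self.tokens[tenant] = min(self.burst, self.tokens[tenant] + elapsed * self.rate)
            if self.tokens[tenant] >= cost:
                self.tokens[tenant] -= cost
                return True
            return False                                   # shed or queue the request

    limiter = TenantRateLimiter(rate_per_sec=100, burst=200)
    print(limiter.allow("tenant-a"), limiter.allow("tenant-b", cost=50))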
Jason Ganz, Director, DX + AI, dbt Labs
Coding Agents & Autonomous Dev
Benchmarking AI Agents Against Realistic Analytical Tasks with ADE-bench
There are many benchmarks that attempt to measure how well LLMs and AI agents can write SQL queries or do complicated statistical analysis. But as most practitioners know, this is only a small part of our job. Before we can write a query, we have to figure out the business context behind the question. We have to figure out which tables to use in a messy database. We have to make subjective decisions about vaguely defined problems. All of this makes benchmarking analytical agents difficult.

We built a new benchmark—ADE-bench—that aspires to capture exactly that messier reality. It gives agents complex analytical environments to work in and ambiguous tasks to solve, and measures how well they perform.

In this talk, we'll share how we built the benchmark, the results of our tests, a bunch of things we learned along the way, and what we think is coming next. The benchmark harness is open source, and can be found here: https://github.com/dbt-labs/ade-bench
Ezi Ozoani, CTO, Aethon.fund
Model Systems
Lessons From RL Systems That Looked Fine Until They Didn't
Reinforcement learning systems often fail not because rewards are wrong, but because optimization pressure is unbounded. Policies exploit edge cases, drift over time, and converge to brittle strategies that look fine in training but break in deployment, especially under bounded actions, safety requirements, resource budgets, and long-term user impact.

This talk focuses on controlling optimization directly: practical techniques for training RL agents that remain stable and predictable under hard constraints. Rather than modifying rewards, we explore structural and system-level approaches that shape behavior by construction.

Topics include:
  • Why reward penalties alone fail to enforce hard constraints under scale and distribution shift
  • Structural constraint mechanisms such as action masking, feasibility filters, and sandboxed execution (see the sketch after this list)
  • How training inside hard boundaries changes policy behavior and improves long-horizon stability, including across retraining cycles
  • Detecting constraint violations and failure modes that do not appear in aggregate return metrics
  • Lessons from applying constrained RL in production-like systems, including failures only discovered after deployment and what ultimately stopped them

The goal is to share concrete algorithmic and system design strategies for deploying reinforcement learning in settings where violations are suboptimal.
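As a minimal illustration of the action-masking item above, the snippet below zeroes out the probability of infeasible actions before sampling, so the policy cannot select them by construction; the feasibility rule is a toy example, not from the talk.

    # Minimal action masking: infeasible actions get probability zero before sampling,
    # so constraint violations are impossible by construction. The mask rule is a toy example.
    import numpy as np

    def masked_sample(logits: np.ndarray, feasible: np.ndarray, rng: np.random.Generator) -> int:
        """logits: (num_actions,); feasible: boolean mask of the same shape."""
        masked = np.where(feasible, logits, -np.inf)       # forbid infeasible actions outright
        probs = np.exp(masked - masked.max())
        probs /= probs.sum()
        return int(rng.choice(len(logits), p=probs))

    rng = np.random.default_rng(0)
    logits = np.array([2.0, 1.5, 0.1, -0.3])
    feasible = np.array([True, False, True, True])         # e.g. action 1 exceeds the budget
    print(masked_sample(logits, feasible, rng))            # never returns 1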
Rohan Kodialam, Co-founder & CEO, Sphinx
Lightning Talks
The (AI) Data Scientist Isn't a Software Engineer
Software engineering has seen a massive lift from AI augmentation over the last 3 years. However, data teams have been left behind. The fundamental loop of software engineering differs significantly from that of data science, making distinct agents necessary. This talk explores:

  • Why does AI struggle with quantitative data, and where are there divergences from human experts?
  • What data representations are most meaningful for today's frontier AI models?
  • How can better data representation let AI become a partner in data science?
  • What are the best practices for deploying AI to teams to maximize acceleration without compromising quality or security?
Gaurav Nuti, Software Engineer, Snowflake
Applied AI
How to Unlock Enterprise Value by Training Your Own Language Models
At Snowflake, I straddle product engineering and model training. I have been involved in training Snowflake's own embedding model series, Arctic Embed v1 and v2, and Snowflake's text2sql model. This talk covers how we decided when to train a model versus when to rely on open-source models or partner with frontier model providers. I'll then briefly describe the models we have trained and the technical lessons we learned along the way.
Kshitij Grover, CTO, Orb
Naveen Rao, VP of AI, Databricks
Denis Yarats, Co-Founder & CTO, Perplexity
Aaron Katz, Co-Founder & CEO, ClickHouse
View all speakers

Real Industry Intelligence

For 13 years, AI Council has brought together the engineers behind breakthrough AI systems: researchers solving training at scale, teams optimizing inference, and practitioners shipping models to millions.

Technical Attendees
Speakers
Years Running

Why Attend?

  • Technical Deep-Dives
  • Engineering Office Hours
  • Hands-On Workshops
  • Events & Networking

Technical Deep-Dives


Get direct insights on production systems and architectural decisions from technical leaders. Our hand-selected speakers don't just present slides. They pull back the curtain on real implementations, complete with performance metrics and hard-learned lessons.

Engineering Office Hours


An AI Council exclusive! Our signature office hours get you dedicated time with speakers for in-depth discussions in a small group setting. Meet your heroes face-to-face, debug your architecture challenges, expand on strategies and discuss the future of AI with the leaders building it.

Hands-On Workshops


Build alongside the maintainers of production AI systems. These aren't just tutorials—they're intensive technical sessions where you'll implement real solutions with guidance from the people who architected them.

Events & Networking


Get access to dozens of exclusive community-curated events where engineering discussions continue in fun, low-pressure settings and the real connections happen. From our Community Drinks & Demos night to founder dinners to firesides, you won't want to miss out!

Past AI Council Talks

Learn from the engineers setting industry standards.

Billion-Scale Vector Search on Object Storage

Simon Hørup Eskildsen, Co-Founder, Turbopuffer

Mickey Liu, Software Engineer, Notion


The Future of Data Engineering in a Post-AI World

Michelle Ufford Winters, Distinguished MTS - Data & Analytics, eBay (ex- Netflix, GoDaddy, Noteable)


Data Meets Intelligence: Where the Data Infra & AI Stack Converge

Naveen Rao, VP of AI, Databricks

George Mathew, Managing Director, Insight Partners


What Every Data Scientist Needs To Know About GPUs

Charles Frye, Developer Advocate, Modal Labs

2026 Technical Tracks
Inference Systems
This track explores the systems and infrastructure powering real-time multimodal AI—from audio and video to vision, speech and mixed-modal interfaces. Talks focus on accelerating inference, reducing latency, scaling new modalities, and designing next-generation model-serving pipelines. Perfect for engineers building the frontier of interactive AI experiences.
AI Engineering
AI Engineering covers the practical workflows, tools and methodologies for evaluating, monitoring and improving AI systems in production. Topics include eval frameworks, observability stacks, guardrails, prompt testing and reliability engineering. For practitioners who keep AI systems correct, safe and performant in the wild.
AI Security & Safety
Dive deep into security, red-teaming, privacy, safety frameworks and governance for modern AI systems. Expect highly technical sessions on identifying vulnerabilities, mitigating model-level risks and building defensible enterprise AI. The right place for anyone serious about safe, trustworthy and compliant AI deployment.
Agent Infrastructure
This track breaks down the architectures, planning systems, memory representations and tool-use loops that make agents work. Talks highlight emerging platforms, orchestration layers and runtime environments for agentic workflows. Ideal for developers building the next generation of intelligent, autonomous systems.
Coding Agents & Autonomous Dev
This track focuses on agents that write, modify and ship software: autonomous PR systems, IDE-integrated agents, code execution sandboxes and fully automated development flows. Sessions explore how coding agents collaborate with humans, reason over large repositories and safely produce production-grade code. For builders at the intersection of AI and software engineering.
Model Systems
Model Systems covers the full lifecycle of model development: pre-training pipelines, fine-tuning strategies, adapters, distillation, small-model architectures and RL-adjacent techniques. Talks emphasize efficiency, scaling laws, post-training improvements and practical engineering around model quality. Designed for teams building or adapting their own custom models.
Data Engineering & Databases
This track examines the data infrastructure that fuels modern AI—vector databases, retrieval engines, pipelines, feature stores and storage systems. Talks highlight how teams architect data layers for speed, quality and relevance across multimodal and agentic workloads. Essential for everyone who believes great AI starts with great data.
Applied AI
Applied AI explores how intelligence-rich products are built by AI-driven engineering teams. This track focuses on the dual power of Applied AI: solving real-world user problems through the application of AI, and radically accelerating the build process itself through the use of AI-native tools. We explore how new AI-driven design and coding tools are lowering the barrier to entry, enabling lean teams to tackle complex challenges. This is for the founders and engineers harnessing this advantage to build powerful, intelligence-rich applications that were previously impossible to ship, faster than ever before.
Analytics & Data Science
This track covers experimentation, causal inference, statistical modeling, testing frameworks and data-driven decision systems. Sessions explore how teams measure model performance, validate behavior, run controlled experiments and generate actionable insights. Perfect for data scientists building the analytical backbone of data & AI-driven organizations.
Lightning Talks
A rapid-fire showcase of cutting-edge ideas, tools, experiments, and prototypes across the entire AI Council ecosystem. These fast, high-density sessions surface emerging concepts and bold technical work that doesn’t fit anywhere else. Come ready to learn something new every few minutes.

AI Council 2026 - SAN FRANCISCO

About the Venue

San Francisco Marriott Marquis

780 Mission St, San Francisco | May 12-14, 2026

Reserve a room at the Marriott Marquis with our special rate.

Partnered with AI & Data's Best

What Builders Say

Pedram Navid, Developer Education, Anthropic

“AI Council is better than any conference I’ve ever been at because the talks are a higher caliber than anything I’ve ever experienced, and the people here are just second to none.”

Charles Frye, Developer Advocate, Modal Labs

“The people who work on the tools that you use every day, the people you admire, they’re there. They want to share what they’ve been working on.”

Ryan Boyd, Co-Founder, MotherDuck

“AI Council provides an intimate setting for interacting with other folks in the industry, whereas other conferences you may not know anyone you meet in the hallways.”