Technical Talks
Fast AI Search on Object Storage @ >1 Trillion Scale
- Data Eng & Databases
This talk goes beyond architecture diagrams to share what actually happens when you operate an agentic search engine on trillions of documents.
We'll dig into how an object storage-native design allows a small team of engineers to manage an AI search engine that scales to:
- Peak load of 1M+ writes per second and 30k+ searches per second
- 1+ trillion documents
- 5+ PB of logical data
- 400+ tenants
- p90 query latency <100 ms
Topics include:
- How a modern storage architecture cuts COGS by 10x or more (see the cost sketch after this list)
- Optimizing traditional vector and full-text search (FTS) indexes for the high latency of object storage (see the read-batching sketch below)
- Building search algorithms that are fine-tuned for LLM-initiated searches
- A simple rate-limiting technique that provides strong performance isolation in multi-tenant environments (see the rate-limiter sketch below)
- Observability, reliability, and performance lessons learned from production incidents
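To make the COGS claim concrete, here is a back-of-envelope comparison. The prices, replication factor, and cache fraction are illustrative assumptions, not figures from the talk:

```python
# Back-of-envelope cost comparison (illustrative prices, not from the talk).
# Assumption: a triplicated SSD-backed architecture vs. object storage holding
# the single authoritative copy, with a small SSD cache in front.

LOGICAL_TB = 5_000                 # ~5 PB of logical data, per the abstract

SSD_PRICE_PER_GB_MONTH = 0.08      # assumed block-storage (gp3-class) price
S3_PRICE_PER_GB_MONTH = 0.023      # assumed object-storage price
REPLICATION_FACTOR = 3             # typical for SSD-based quorum replication
CACHE_FRACTION = 0.05              # assume ~5% of data kept hot on local SSD

ssd_cost = LOGICAL_TB * 1_000 * REPLICATION_FACTOR * SSD_PRICE_PER_GB_MONTH
object_cost = LOGICAL_TB * 1_000 * (
    S3_PRICE_PER_GB_MONTH + CACHE_FRACTION * SSD_PRICE_PER_GB_MONTH
)

print(f"SSD-replicated storage: ${ssd_cost:,.0f}/month")   # ~$1.2M/month
print(f"Object-storage-native:  ${object_cost:,.0f}/month")  # ~$135k/month
print(f"Ratio: {ssd_cost / object_cost:.1f}x")               # ~9x, order of 10x
```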
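One general way to adapt an index to object-storage latency is to trade bandwidth for round trips: lay out posting lists or clusters contiguously, then fetch a few large byte ranges per query instead of many small reads. A minimal sketch of that read pattern follows; the layout and function names are assumptions for illustration, not turbopuffer's actual design:

```python
import boto3

s3 = boto3.client("s3")

def coalesce(ranges, gap=1 << 20):
    """Merge byte ranges separated by less than `gap` bytes, so one large
    GET replaces many small ones. Over-reading is cheap on object storage;
    round trips (tens of milliseconds each) are not."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= gap:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def fetch_index_blocks(bucket, key, ranges):
    """Fetch the posting-list / cluster blocks a query needs with as few
    ranged GETs as possible."""
    blobs = []
    for start, end in coalesce(ranges):
        resp = s3.get_object(Bucket=bucket, Key=key,
                             Range=f"bytes={start}-{end - 1}")
        blobs.append((start, resp["Body"].read()))
    return blobs
```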
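The abstract doesn't name the rate-limiting technique; one simple scheme that provides this kind of per-tenant isolation is a token bucket, sketched below. All names and parameters are illustrative assumptions:

```python
import time
from collections import defaultdict

class TokenBucket:
    """Per-tenant token bucket: each tenant accrues capacity at a fixed
    rate and pays per request, so one noisy tenant cannot starve others."""

    def __init__(self, rate, burst):
        self.rate = rate        # tokens added per second
        self.burst = burst      # maximum bucket size
        self.tokens = burst
        self.last = time.monotonic()

    def try_acquire(self, cost=1.0):
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False            # caller sheds or queues the request

# One bucket per tenant; the defaults are illustrative, not production numbers.
buckets = defaultdict(lambda: TokenBucket(rate=100.0, burst=200.0))

def admit(tenant_id, cost=1.0):
    return buckets[tenant_id].try_acquire(cost)
```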
Attendees will leave with a concrete understanding of how separating storage from compute, and treating object storage as the primary database, changes not only the cost structure but the entire operational model of large-scale AI search.
Nikhil Benesch, CTO, Turbopuffer
Nikhil Benesch has spent over a decade working on database systems. He was an early engineer at Cockroach Labs, focused on distributed SQL, and went on to serve as CTO of Materialize, where he led development of a streaming database built on incremental computation. He is an active open-source contributor, with projects spanning SQL tooling, systems programming in Rust, and developer utilities. Nikhil is now CTO of turbopuffer, a serverless search engine built on object storage and used by products like Anthropic, Cursor, and Notion.