Technical Talks
Fast AI Search on Object Storage @ >1 Trillion Scale
- Data Eng & Databases
This talk goes beyond architecture diagrams to share what actually happens when you operate an agentic search engine on trillions of documents.
We'll dig into how an object storage-native design allows a small team of engineers to manage an AI search engine that scales to:
- Peak load of 1M+ writes per second and 30k+ searches per second
- 1+ trillion documents
- 5+ PB of logical data
- 400+ tenants
- p90 query latency <100 ms
Topics include:
- How a modern storage architecture cuts COGS by 10x or more (see the cost sketch after this list)
- Optimizing traditional vector and full-text search (FTS) indexes for the high latency of object storage (see the read-batching sketch below)
- Building search algorithms that are fine-tuned for LLM-initiated searches
- A simple rate-limiting technique that provides strong performance isolation in multi-tenant environments (see the rate-limiter sketch below)
- Observability, reliability, and performance lessons learned from production incidents
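To make the COGS claim concrete, here is a back-of-envelope comparison. The prices, replication factor, and cache fraction are illustrative assumptions, not figures from the talk:

```python
# Back-of-envelope cost comparison (illustrative prices, not from the talk).
# Assumption: a triplicated SSD-backed architecture vs. object storage holding
# the single authoritative copy, with a small SSD cache in front.

LOGICAL_TB = 5_000                 # ~5 PB of logical data, per the abstract

SSD_PRICE_PER_GB_MONTH = 0.08      # assumed block-storage (gp3-class) price
S3_PRICE_PER_GB_MONTH = 0.023      # assumed object-storage price
REPLICATION_FACTOR = 3             # typical for SSD-based quorum replication
CACHE_FRACTION = 0.05              # assume ~5% of data kept hot on local SSD

ssd_cost = LOGICAL_TB * 1_000 * REPLICATION_FACTOR * SSD_PRICE_PER_GB_MONTH
object_cost = LOGICAL_TB * 1_000 * (
    S3_PRICE_PER_GB_MONTH + CACHE_FRACTION * SSD_PRICE_PER_GB_MONTH
)

print(f"SSD-replicated storage: ${ssd_cost:,.0f}/month")   # ~$1.2M/month
print(f"Object-storage-native:  ${object_cost:,.0f}/month")  # ~$135k/month
print(f"Ratio: {ssd_cost / object_cost:.1f}x")               # ~9x, order of 10x
```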
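One general way to adapt an index to object-storage latency is to trade bandwidth for round trips: lay out posting lists or clusters contiguously, then fetch a few large byte ranges per query instead of many small reads. A minimal sketch of that read pattern follows; the layout and function names are assumptions for illustration, not turbopuffer's actual design:

```python
import boto3

s3 = boto3.client("s3")

def coalesce(ranges, gap=1 << 20):
    """Merge byte ranges separated by less than `gap` bytes, so one large
    GET replaces many small ones. Over-reading is cheap on object storage;
    round trips (tens of milliseconds each) are not."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= gap:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

def fetch_index_blocks(bucket, key, ranges):
    """Fetch the posting-list / cluster blocks a query needs with as few
    ranged GETs as possible."""
    blobs = []
    for start, end in coalesce(ranges):
        resp = s3.get_object(Bucket=bucket, Key=key,
                             Range=f"bytes={start}-{end - 1}")
        blobs.append((start, resp["Body"].read()))
    return blobs
```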
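The abstract doesn't name the rate-limiting technique; one simple scheme that provides this kind of per-tenant isolation is a token bucket, sketched below. All names and parameters are illustrative assumptions:

```python
import time
from collections import defaultdict

class TokenBucket:
    """Per-tenant token bucket: each tenant accrues capacity at a fixed
    rate and pays per request, so one noisy tenant cannot starve others."""

    def __init__(self, rate, burst):
        self.rate = rate        # tokens added per second
        self.burst = burst      # maximum bucket size
        self.tokens = burst
        self.last = time.monotonic()

    def try_acquire(self, cost=1.0):
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False            # caller sheds or queues the request

# One bucket per tenant; the defaults are illustrative, not production numbers.
buckets = defaultdict(lambda: TokenBucket(rate=100.0, burst=200.0))

def admit(tenant_id, cost=1.0):
    return buckets[tenant_id].try_acquire(cost)
```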
Attendees will leave with a concrete understanding of how separating storage from compute, and treating object storage as the primary database, changes not only the cost structure but the entire operational model of large-scale AI search.
Nikhil Benesch, CTO, Turbopuffer
Nikhil Benesch has spent over a decade working on database systems. He was an early engineer at Cockroach Labs, focused on distributed SQL, and went on to serve as CTO of Materialize, where he led development of a streaming database built on incremental computation. He is an active open-source contributor, with projects spanning SQL tooling, systems programming in Rust, and developer utilities. Nikhil is now CTO of turbopuffer, a serverless search engine built on object storage and used by products like Anthropic, Cursor, and Notion.