Buy Your Tickets for AI Council 2026!

2026 Talks

Chang She
Co-founder & CEO | LanceDB

Trillion is the New Billion: Managing Really Large Multimodal Datasets for AI

  • AI Engineering

Most AI problems are really data problems. AI workloads bring with them ever larger amounts of data from multiple modalities (e.g., text, images, audio, video, sensor data). If you were indexing, say, the internet, you would need to solve a number of new data infra challenges:

  1. Storing large blobs and avoiding copying them over and over during processing

  2. Dealing with much larger table sizes: trillion rows with a capital T

  3. Supporting workloads like Search, Curation, and Training directly from your dataset instead of having to move data to/from point-solution systems

  4. Dealing with *really* distributed pipelines: what happens when your storage, CPUs, and GPUs are spread across different clouds / vendors?

In this talk we will dive into detail on why it's challenging to manage trillion-scale wide tables with multimodal data. We'll see why existing data infra doesn't support these new data types, workloads, or scale. And we'll take a quick under-the-hood peek at how the Lance format and LanceDB solve these problems at a foundational level. Zooming out, we'll cover how LanceDB fits into the existing data stack alongside Iceberg. Finally, we'll talk through our roadmap and show you the big improvements we're working on in 2026.

Whether you're doing large-scale search or building the next frontier model, this talk will help you scale more easily, get to production faster, and save on infra cost.

Chang She

Co-founder & CEO

LanceDB

Chang She is CEO/Co-founder at Eto Labs building modern data infrastructure for AI. Previously he architected the ML and experimentation stack at TubiTV as VP of Engineering. In the mythical pre-pandemic epoch, Chang was the 2nd major contributor to Pandas, CTO/Co-founder of DataPad, and a recovering financial quant.