Sustainable prompt engineering is a challenge. Whenever we swap models, shift the input data distribution, or extend a pipeline, we find ourselves manually editing LLM prompts without fully understanding the consequences. DSPy takes a step toward addressing this by treating prompts as learnable parameters rather than hand-crafted strings.
This workshop introduces DSPy's building blocks: signatures as semantic contracts, modules as composable pipeline steps, and optimizers as the training loop. It then demonstrates how they combine to produce pipelines that improve themselves systematically.
During the workshop, we will demo a pattern that is still underused in practice: optimize the evaluator first, then let the evaluator drive the optimizer. Starting from a retrieval and generation pipeline, we’ll explore how a small set of human-generated labels can be used to align an LLM judge to expert judgment. DSPy tools can then use that aligned judge’s feedback signal to iteratively restructure and improve the pipeline itself.
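The alignment step above can be sketched in plain Python. This is a conceptual illustration, not the DSPy API we will use in the workshop: `agreement` and `select_judge` are hypothetical helpers, and each judge is just a callable returning a binary verdict.

```python
def agreement(judge, labeled_examples):
    """Fraction of expert labels that the judge reproduces.

    `labeled_examples` is a list of dicts with an 'output' to grade
    and an expert 'label' (0 or 1); `judge` maps an output to 0 or 1.
    """
    hits = sum(judge(ex["output"]) == ex["label"] for ex in labeled_examples)
    return hits / len(labeled_examples)

def select_judge(candidate_judges, labeled_examples):
    """Pick the candidate judge that best matches the expert labels."""
    return max(candidate_judges, key=lambda j: agreement(j, labeled_examples))
```

Once a judge agrees with the experts on the small labeled set, its score can serve as the metric an optimizer maximizes over the whole pipeline.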
The workflow runs on the Databricks technology stack: we'll use Databricks Foundation Model endpoints as the LLM backend, instrument everything with MLflow tracing, and inspect the before-and-after prompts to build intuition for what optimization actually does under the hood. We'll close with a look at recent Databricks research on program evolution and meta-optimization, and the benchmark results showing why chaining optimizers compounds their individual gains.
All notebooks will be shared after the session.
Bio Coming Soon!