Technical Talks
RLVR in Practice: From Synthetic Data to GRPO
- Model Systems
Reinforcement Learning from Verifiable Rewards (RLVR) is increasingly common in post-training pipelines, but the practical details are often glossed over. How do you design reward functions that programmatically verify model outputs? What makes synthetic training data effective? How do you build a custom RL environment that doesn't silently break your training?
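To make the first question concrete, here is a minimal sketch of what a programmatically verifiable reward could look like, together with the group-relative advantage step that GRPO-style training builds on. It is an illustration under stated assumptions, not material from the talk: the `\boxed{...}` answer convention, the exact-match check, and the function names `extract_answer`, `verifiable_reward`, and `group_relative_advantages` are all hypothetical choices made for this example.

```python
import re
from statistics import mean, pstdev
from typing import List, Optional


def extract_answer(completion: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} span, or None if absent.

    Assumption: the model is prompted to wrap its final answer in \\boxed{}.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Binary reward: 1.0 if the extracted answer matches the reference exactly.

    A missing or unparseable answer scores 0.0 instead of raising, so a
    malformed completion cannot crash the training loop.
    """
    answer = extract_answer(completion)
    if answer is None:
        return 0.0
    return 1.0 if answer == reference.strip() else 0.0


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of completions for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps). Using the
    population standard deviation here is an assumption of this sketch.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    completions = [
        "The total is \\boxed{42}",
        "I think it is \\boxed{41}",
        "No final answer given",
        "Checking again: \\boxed{42}",
    ]
    rewards = [verifiable_reward(c, "42") for c in completions]
    print(rewards)
    print(group_relative_advantages(rewards))
```

One design point worth noting: when every completion in a group earns the same reward, all advantages come out as zero and the group contributes no learning signal, which is one way a reward or environment bug can go unnoticed during training.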
Speaker: Chris Alexiuk, Product Research Engineer, NVIDIA