Buy Early Bird Tickets for AI Council 2026!

Technical Talks

Chris Alexiuk
Product Research Engineer | NVIDIA

RLVR in Practice: From Synthetic Data to GRPO

  • Model Systems

Reinforcement Learning from Verifiable Rewards (RLVR) is increasingly common in post-training pipelines, but the practical details are often glossed over. How do you design reward functions that programmatically verify model outputs? What makes synthetic training data effective? How do you build a custom RL environment that doesn't silently break your training?
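
As one illustration of the first question, a programmatically verifiable reward can be as simple as parsing a final answer and comparing it to a known reference. The sketch below is not taken from the talk; it assumes a hypothetical output convention in which the model ends its response with "Answer: <value>", and the name `verify_answer` is made up for this example.

```python
import re
from fractions import Fraction

# Assumed (hypothetical) convention: the response ends with "Answer: <number>".
# Accepts integers, decimals, and simple fractions such as 1/2.
ANSWER_PATTERN = re.compile(r"Answer:\s*(-?\d+(?:/\d+|\.\d+)?)\s*$")


def verify_answer(completion: str, expected: str) -> float:
    """Return 1.0 if the final answer equals `expected` numerically, else 0.0."""
    match = ANSWER_PATTERN.search(completion.strip())
    if match is None:
        # Unparseable output earns no reward instead of raising,
        # so one malformed sample cannot crash a training run.
        return 0.0
    try:
        # Fraction gives exact comparison, so "1/2" and "0.5" are equal.
        return 1.0 if Fraction(match.group(1)) == Fraction(expected) else 0.0
    except (ValueError, ZeroDivisionError):
        return 0.0
```

Returning 0.0 on every failure path, rather than raising, is one way to keep a reward function from silently or loudly breaking training; logging the fraction of unparseable completions alongside the reward would make such failures visible.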
