

The Dawn of Dynamic AI: RFT Comes Online, w/ Predibase CEO Dev Rishi, from Inference by Turing Post
Jul 16, 2025
Dev Rishi, CEO and co-founder of Predibase, discusses the shift from static AI systems to ones that learn continuously. He explains how reinforcement learning lets models adapt through ongoing user feedback and points to its potential in healthcare and finance. Rishi also covers the challenges of running these dynamic models in practice, such as reward hacking and maintaining quality. The conversation presents 'practical specialized intelligence' as a more stable alternative to AGI and considers how it could reshape various economic niches.
AI Snips
Reinforcement Fine-Tuning Impact
- Reinforcement fine-tuning (RFT) improves models from small amounts of data by using reward signals instead of labeled examples.
- With RFT, model improvement will shift from one-off tuning to continuous learning inside production feedback loops.
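
To make "reward signals instead of labeled data" concrete, here is a minimal sketch of a programmatic reward function in Python. It is not taken from the episode or from any Predibase API: the function name `format_reward` and the required keys are hypothetical, chosen only to show that a reward can score an output without a reference answer.

```python
import json


def format_reward(completion: str, required_keys=("answer", "confidence")) -> float:
    """Hypothetical reward: score a completion by structure alone, with no labeled answer.

    Returns 0.0 if the completion is not a JSON object, 0.5 if it is a JSON
    object, plus up to 0.5 more in proportion to how many required keys appear.
    """
    try:
        parsed = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(parsed, dict):
        return 0.0
    present = sum(1 for key in required_keys if key in parsed)
    return 0.5 + 0.5 * present / len(required_keys)
```

A reward this simple is also easy to game: a model can emit well-formed JSON with a useless answer and still score highly, which is one form of the reward hacking mentioned in the episode summary.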
Healthcare Uses Continuous Learning
- Some healthcare companies use a feedback pipeline combining expert annotations and model judges to improve AI assistants in production.
- This early real-world implementation shows dynamic learning from user and expert feedback is feasible today.
Build Feedback Data Pipelines
- Collect prompts and responses automatically from production to build feedback datasets.
- Use small amounts of user feedback to fine-tune models continuously with techniques like Direct Preference Optimization (DPO).
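
The two steps above can be sketched in plain Python, under stated assumptions. The record shape (`FeedbackRecord` with a thumbs-up flag) and the helper names are hypothetical and not described in the episode; the loss follows the standard DPO objective, computed here from precomputed sequence log-probabilities rather than from a real model.

```python
import math
from dataclasses import dataclass


@dataclass
class FeedbackRecord:
    """Hypothetical production log entry: one prompt, one response, one user rating."""
    prompt: str
    response: str
    thumbs_up: bool


def build_preference_pairs(records):
    """Pair every liked response with every disliked response to the same prompt."""
    by_prompt = {}
    for record in records:
        bucket = by_prompt.setdefault(record.prompt, {"chosen": [], "rejected": []})
        bucket["chosen" if record.thumbs_up else "rejected"].append(record.response)
    pairs = []
    for prompt, bucket in by_prompt.items():
        for chosen in bucket["chosen"]:
            for rejected in bucket["rejected"]:
                pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one pair: -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio)))."""
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically stable softplus(-margin), which equals -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


records = [
    FeedbackRecord("Summarize the note.", "Concise summary.", True),
    FeedbackRecord("Summarize the note.", "Off-topic reply.", False),
]
pairs = build_preference_pairs(records)
```

Prompts that received only positive or only negative feedback produce no pairs in this sketch, so a real pipeline would need another way to use them. In practice the log-probabilities would come from the model being tuned and a frozen reference copy of it.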