AI Tinkerers - "One-Shot"

Build Better AI Agents with RL & Fine-Tuning (Kyle from OpenPipe)

Oct 17, 2025
Kyle Corbett, founder of OpenPipe, shares insights on enhancing AI agents through fine-tuning and reinforcement learning. He reveals how RL can cut error rates by 60% and reduce latency, making AI agents more reliable. Listeners learn about building an effective email search agent that surpasses GPT-3.5, using the Enron dataset for realistic training. Kyle also discusses the importance of designing nuanced reward functions and highlights ideal use cases for RL fine-tuning, including real-time voice assistants and high-volume applications.
AI Snips
ANECDOTE

Browser Agent Failures Sparked Fine-Tuning Focus

  • Kyle and his brother started building a browser agent before GPT-4 launched and ran into reliability, cost, and latency issues.
  • Those problems led them to focus on fine-tuning and RL to improve agent performance.
INSIGHT

Fine-Tuning Dramatically Improves Agent Accuracy

  • Fine-tuning an agent for an email search task raised accuracy from ~90% to 96%, i.e., errors fell from ~10% to ~4% of queries, a ~60% cut in error rate.
  • Small, targeted model fine-tuning can outperform larger off-the-shelf models on domain tasks.
INSIGHT

Fine-Tuned Smaller Models Enable On-Prem Privacy

  • Small models (7–14B) fine-tuned for a task can run on-prem and achieve significant latency and cost advantages.
  • That makes them attractive for sensitive or regulated data like email.
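The episode also stresses the importance of designing nuanced reward functions for RL fine-tuning. As a rough illustrative sketch only (not code from the episode), a reward for an email search agent might combine retrieval correctness with a small penalty for long tool-call chains; the function name, signature, and weights below are all hypothetical.

```python
# Hypothetical reward function for RL fine-tuning an email search agent.
# Nothing here comes from the episode; names and weights are illustrative.

def reward(predicted_message_ids: list[str],
           gold_message_ids: list[str],
           num_tool_calls: int,
           max_tool_calls: int = 10) -> float:
    """Score one rollout of the email search agent."""
    if not gold_message_ids:
        return 0.0

    # Primary signal: did the agent retrieve the right emails?
    predicted = set(predicted_message_ids)
    gold = set(gold_message_ids)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)

    # Secondary signal: gently discourage long tool-call chains (latency/cost).
    efficiency_penalty = 0.1 * min(num_tool_calls / max_tool_calls, 1.0)

    return f1 - efficiency_penalty
```

Blending a correctness term with a small efficiency penalty is one way to encode the "nuanced" trade-offs mentioned in the episode (accuracy vs. latency and cost), rather than rewarding exact matches alone.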