Deep Papers

Small Language Models are the Future of Agentic AI

Sep 5, 2025
Peter Belcak, an AI research scientist at NVIDIA, discusses his groundbreaking paper on the promise of small language models (SLMs) for agentic AI. He highlights how SLMs can outperform larger models in cost-effectiveness and operational efficiency, walks through the process of migrating agentic workloads from large models to smaller ones, and introduces tools that support the required fine-tuning. He also addresses bias mitigation in data selection and the importance of collaboration in the evolving AI landscape, paving the way for a more accessible future.
INSIGHT

SLMs Can Replace LLMs For Many Agent Tasks

  • Small language models (SLMs) can handle many agentic tasks and match larger models on specific, well-scoped ones.
  • SLMs become the natural choice when balancing capability, operational fit, and cost.
INSIGHT

Hardware Favors Smaller Models For Cost Efficiency

  • Specialized inference chips amplify efficiency gains more for SLMs than for LLMs, because smaller models fit better within on-chip memory and signaling constraints.
  • That hardware advantage can translate into much lower per-token and per-query costs for SLMs.
INSIGHT

Code Orchestration Often Drives Agent Workflows

  • Distinguish model agency (ML deciding plans) from code agency (orchestration driving flow).
  • Many enterprise agents already use code orchestration and only invoke LMs for narrow language errands.
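The distinction above can be sketched in a few lines: control flow lives in ordinary code, and the language model is invoked only for one narrow errand. This is a minimal illustration, not code from the paper; `call_slm` is a hypothetical stand-in for a real SLM endpoint, stubbed here with keyword matching so the sketch runs on its own.

```python
def call_slm(prompt: str) -> str:
    """Hypothetical SLM call, stubbed for illustration.

    A real system would query a locally hosted small model here;
    this stub classifies by keyword so the example is self-contained.
    """
    return "refund" if "money back" in prompt.lower() else "other"


def handle_ticket(ticket: str) -> str:
    """Code agency: plain code owns the plan and the branching."""
    # The LM is used only for one narrow language errand: intent labeling.
    intent = call_slm(f"Classify the intent of this ticket: {ticket}")
    # Deterministic orchestration, not the model, drives the workflow.
    if intent == "refund":
        return "routed_to_refund_queue"
    return "routed_to_general_queue"
```

Under this pattern the model never chooses the next step, so a small model that labels intents reliably is enough, which is the operational fit the episode argues for.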