

Small Language Models are the Future of Agentic AI
Sep 5, 2025
Peter Belcak, an AI research scientist at NVIDIA, discusses his groundbreaking paper on the promise of small language models (SLMs) for agentic AI. He explains how SLMs can beat larger models on cost-effectiveness and operational efficiency, walks through the process of converting LLM-based agents into SLM-based ones, and introduces tools that support the necessary fine-tuning. He also addresses bias mitigation in data selection and the importance of collaboration in the evolving AI landscape, paving the way for a more accessible future.
AI Snips
SLMs Can Replace LLMs For Many Agent Tasks
- Small language models (SLMs) can handle many agent errands and match larger models for specific tasks.
- When capability, operational fit, and cost are weighed together, SLMs become the natural choice; the routing sketch below shows the pattern.
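A minimal sketch of that routing idea in Python. The `Task` fields, model names, and the `call_model` helper are all illustrative assumptions, not anything named in the episode: the point is simply that the orchestrator defaults to the small model and escalates to the large one only when a task genuinely needs broad capability.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical model identifiers; placeholders, not real endpoints.
SLM = "slm-3b-finetuned"    # cheap, fast, specialized
LLM = "llm-70b-generalist"  # expensive generalist fallback

@dataclass
class Task:
    prompt: str
    needs_broad_knowledge: bool = False  # open-ended reasoning, rare domains

def pick_model(task: Task) -> str:
    # Narrow, repetitive agent errands (extraction, formatting, routing)
    # go to the fine-tuned SLM; only broad tasks escalate to the LLM.
    return LLM if task.needs_broad_knowledge else SLM

def run(task: Task, call_model: Callable[[str, str], str]) -> str:
    # call_model is an assumed inference client: (model_id, prompt) -> text.
    return call_model(pick_model(task), task.prompt)
```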
Hardware Favors Smaller Models For Cost Efficiency
- Specialized inference chips amplify efficiency gains more for SLMs than for LLMs, since small models fit more comfortably within on-chip memory and signaling constraints.
- That hardware advantage can translate into much lower per-token and per-query costs for SLMs; a back-of-the-envelope comparison follows below.
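To make the cost claim concrete, a back-of-the-envelope comparison. Every number here is an assumption for illustration (the episode gives no specific prices); the shape of the arithmetic is what matters: per-query cost scales linearly with per-token price, so a large gap in serving cost passes straight through to the bill.

```python
def cost_per_query(price_per_million_tokens: float, tokens_per_query: int) -> float:
    # Linear cost model: $/query = ($/1M tokens) * tokens / 1e6.
    return price_per_million_tokens * tokens_per_query / 1_000_000

# Hypothetical prices and usage; not measured figures.
slm_price = 0.06   # $ per million tokens for a small specialized model
llm_price = 3.00   # $ per million tokens for a large generalist
tokens = 800       # assumed tokens consumed by one agent step

print(f"SLM: ${cost_per_query(slm_price, tokens):.5f} per query")
print(f"LLM: ${cost_per_query(llm_price, tokens):.5f} per query")
print(f"-> roughly {llm_price / slm_price:.0f}x cheaper per query on the SLM")
```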
Code Orchestration Often Drives Agent Workflows
- Distinguish model agency (the model deciding the plan) from code agency (orchestration code driving the flow).
- Many enterprise agents already rely on code orchestration and invoke language models only for narrow language errands, as in the sketch below.
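A short sketch of the distinction, assuming only a generic `lm` callable (prompt in, text out); the function name and ticket fields are made up for illustration. The plan lives in ordinary code, and the language model is invoked once, for a narrow, well-scoped language errand:

```python
def handle_ticket(ticket: dict, lm) -> dict:
    # Code agency: deterministic orchestration logic, no model involved.
    if ticket["category"] == "refund" and ticket["amount"] < 50:
        decision = "approved"
    else:
        decision = "escalated"

    # The only model call: a narrow language errand (drafting a reply),
    # exactly the kind of task a small fine-tuned model can handle.
    reply = lm(
        f"Write a short, polite reply telling the customer their "
        f"{ticket['category']} request was {decision}."
    )
    return {"decision": decision, "reply": reply}
```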