Software Engineering Radio - the podcast for professional software developers

SE Radio 661: Sunil Mallya on Small Language Models

Mar 25, 2025
Sunil Mallya, Co-founder and CTO of Flip AI, shares his expertise on small language models (SLMs) versus large language models (LLMs). He explains how the two differ and why SLMs can be more efficient and accurate for specific tasks. Sunil highlights the importance of domain-specific training datasets and discusses recent advances such as DeepSeek-R1 that show smaller models outperforming larger ones in particular contexts. He also touches on the evolving landscape of model deployment and how organizations can optimize performance while managing costs.
59:28

Episode guests

Sunil Mallya, Co-founder and CTO of Flip AI

Podcast summary created with Snipd AI

Quick takeaways

  • Small language models (SLMs) prioritize specialization and efficiency, making them preferable for niche applications over larger language models (LLMs).
  • Training SLMs effectively requires high-quality, domain-specific data; with such data, SLMs can achieve accurate results from far smaller datasets than LLMs require.

Deep dives

Understanding Small Language Models (SLMs)

Small language models (SLMs) are defined not solely by parameter count but by their practicality and resource requirements. They can operate effectively without the extensive GPU resources that large language models (LLMs) typically demand. As of early 2025, a model with roughly 10 billion parameters, a maximum context length of about 10,000 words, and one-second inference latency exemplifies an SLM. What counts as 'large' evolves over time, driven by advances in underlying hardware and the rapid pace of model development.
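To make that early-2025 rule of thumb concrete, here is a minimal Python sketch that encodes the figures above as a simple classifier; the `ModelSpec` type, the threshold constants, and the example model are illustrative assumptions, not definitions from the episode.

```python
from dataclasses import dataclass

# Illustrative early-2025 thresholds taken from the discussion above
# (assumptions, not a formal standard): ~10B parameters, ~10k-word
# context window, ~1-second inference latency.
MAX_SLM_PARAMS = 10_000_000_000
MAX_SLM_CONTEXT_WORDS = 10_000
MAX_SLM_LATENCY_SECONDS = 1.0

@dataclass
class ModelSpec:
    name: str
    num_params: int             # total parameter count
    context_words: int          # maximum context length, in words
    inference_latency_s: float  # typical per-request latency, in seconds

def is_slm(spec: ModelSpec) -> bool:
    """Return True if the model fits the episode's rough SLM profile."""
    return (
        spec.num_params <= MAX_SLM_PARAMS
        and spec.context_words <= MAX_SLM_CONTEXT_WORDS
        and spec.inference_latency_s <= MAX_SLM_LATENCY_SECONDS
    )

# Example: a hypothetical 7B-parameter model qualifies under these thresholds.
print(is_slm(ModelSpec("example-7b", 7_000_000_000, 8_000, 0.6)))  # True
```

Because the definition is a moving target, the cutoffs are deliberately isolated as constants that would need revisiting as hardware and models advance.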
