LinkedIn Recommender System: Predictive ML vs LLMs

MLOps.community

Mitigate LLM Latency with Distillation or Offline Use

  • Use lightweight LLMs, or distill large models into smaller student models, so feed ranking avoids the inference latency of full-size LLMs (a distillation sketch follows this list).
  • Alternatively, run LLMs offline to generate features, and let fast traditional models handle online ranking (see the second sketch below).
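To make the first point concrete, here is a minimal sketch of response-based knowledge distillation in PyTorch, assuming the teacher LLM's scores have already been computed and cached. The `StudentRanker` architecture, feature dimensions, and hyperparameters are illustrative assumptions, not details from the episode.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical student ranker: a small MLP over precomputed ranking features.
# Dimensions and architecture are illustrative, not from the episode.
class StudentRanker(nn.Module):
    def __init__(self, feature_dim: int = 128, num_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 64),
            nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft KL term against the teacher with a hard cross-entropy term."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients are comparable to the hard-label term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Training step sketch: teacher (LLM) logits are assumed precomputed offline.
student = StudentRanker()
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-3)
features = torch.randn(32, 128)        # batch of ranking features
teacher_logits = torch.randn(32, 2)    # cached teacher scores
labels = torch.randint(0, 2, (32,))    # click / no-click labels
loss = distillation_loss(student(features), teacher_logits, labels)
loss.backward()
optimizer.step()
```

At serving time only the small student runs in the feed, so the latency cost of the large teacher is paid once during training rather than on every request.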
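The second pattern splits the latency budget: an offline batch job runs the expensive LLM to produce features, and the online ranker only does cheap lookups and arithmetic. The sketch below assumes an embedding-style feature; `llm_embed` and the in-memory "feature store" are hypothetical stand-ins for real batch inference and storage.

```python
import numpy as np

def llm_embed(texts: list[str]) -> np.ndarray:
    """Stand-in for batch LLM inference; returns one vector per post."""
    rng = np.random.default_rng(0)
    return rng.standard_normal((len(texts), 16))

# --- Offline job (latency budget: hours) ----------------------------------
posts = {"post_1": "How we scaled our feed...", "post_2": "Hiring ML engineers"}
vectors = llm_embed(list(posts.values()))
feature_store = {pid: vec for pid, vec in zip(posts, vectors)}

# --- Online serving (latency budget: milliseconds) ------------------------
weights = np.random.default_rng(1).standard_normal(16)  # trained ranker weights

def rank(candidate_ids: list[str]) -> list[str]:
    """Score candidates with a fast dot product over cached LLM features."""
    scores = {pid: float(feature_store[pid] @ weights) for pid in candidate_ids}
    return sorted(scores, key=scores.get, reverse=True)

print(rank(["post_1", "post_2"]))
```

The design choice is that LLM quality flows into the features while the online path stays a traditional fast model, so feed latency is unaffected by how large or slow the LLM is.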