
Python Power: How Daft Embeds Models and Revolutionizes Data Processing // Sammy Sidhu // #165


How to Distill a Large Language Model

There's this great paper on distilling step-by-step from Alex Radner and a lot of other people whose names I can't remember. It's basically distilling the model, and it makes it much easier for you to get that distilled model and train it with less data. Back in my day, distillation was kind of dumb: you would train a big model, and then you would essentially use its output as the ground truth for a smaller model. But now they're asking for this chain-of-thought reasoning.
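The classic recipe described here, using the big model's soft outputs as training targets for a smaller one, can be sketched in plain Python. This is an illustrative toy, not code from the episode; the function names, the temperature value, and the example logits are all made up for demonstration.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: a higher temperature softens the
    # teacher's distribution so the student sees richer "soft labels".
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # Cross-entropy between the teacher's soft labels and the student's
    # predictions -- the "use the big model's output as ground truth" idea.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(p_teacher, p_student))

# A student whose logits roughly track the teacher's incurs a lower loss
# than one that disagrees with it.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 4.0, 1.0]
assert distillation_loss(teacher, close_student) < distillation_loss(teacher, far_student)
```

"Distilling step-by-step" extends this by also training the student on the teacher's chain-of-thought rationales, not just its final output distribution, which is what lets it work with less data.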

Transcript excerpt from 22:17
