Cost/Performance Optimization with LLMs [Panel]

MLOps.community

How to Reduce Cost With Octo

Octo now offers a fully automated platform that lets a broader range of users deploy their own models. You can live with hardware that actually hits your latency and throughput requirements. First, shrink the model with techniques such as structured pruning and knowledge distillation, where you take the large model and remove portions of it until it fits in whatever systems you have. Second, choose the right silicon: the option with the lowest possible cost that still hits your performance requirements.
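The second step, picking the cheapest hardware that still meets the requirements, can be sketched as a simple filter-then-minimize selection. This is only an illustration, not Octo's actual method: the `HardwareOption` type, the `cheapest_meeting_requirements` helper, the hardware names, and every number below are hypothetical placeholders that would in practice come from your own benchmarks.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class HardwareOption:
    """One benchmarked deployment target (all fields are hypothetical)."""
    name: str
    cost_per_hour: float    # price of the instance, in dollars per hour
    latency_ms: float       # measured latency for the model on this target
    throughput_rps: float   # measured requests per second


def cheapest_meeting_requirements(
    options: Sequence[HardwareOption],
    max_latency_ms: float,
    min_throughput_rps: float,
) -> Optional[HardwareOption]:
    """Return the lowest-cost option that satisfies both requirements.

    Returns None when no option is feasible, which suggests the model
    needs to be made smaller (for example by pruning or distillation).
    """
    feasible = [
        o for o in options
        if o.latency_ms <= max_latency_ms
        and o.throughput_rps >= min_throughput_rps
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda o: o.cost_per_hour)


# Placeholder measurements, made up for illustration only.
CANDIDATES = [
    HardwareOption("cpu-only", cost_per_hour=0.5, latency_ms=300.0, throughput_rps=5.0),
    HardwareOption("gpu-small", cost_per_hour=1.0, latency_ms=80.0, throughput_rps=40.0),
    HardwareOption("gpu-large", cost_per_hour=4.0, latency_ms=30.0, throughput_rps=200.0),
]

choice = cheapest_meeting_requirements(CANDIDATES, max_latency_ms=100.0, min_throughput_rps=20.0)
print(choice.name if choice else "no feasible hardware")
```

With these placeholder numbers the cheapest option misses the latency target, so the selection falls to the next cheapest one that satisfies both constraints rather than to the fastest one.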
