Cost/Performance Optimization with LLMs [Panel]

MLOps.community

How to Scale a Summarization System

Cost is an issue, especially with scale. Once we actually started getting that scale, both the cost of running the model and how much hardware we could run it on really became a bottleneck. The golden threshold has been: can we make this model run on a single A100 GPU, sorry, an A10 GPU, or on any type of GPU requirement? And if you even move it to CPU, the game changes further still: you can move to billions of items without necessarily even thinking about this.
