3min chapter

The Future of Search in the Era of Large Language Models // Saahil Jain // MLOps Podcast #150

MLOps.community

CHAPTER

The Trade-Off Between Latency and Relevance in Generative Models

There is an interesting trade-off between latency, throughput, and relevance. The best mind trick, in my eyes, is what you were talking about when it comes to having somebody fill out a survey or do something while they're waiting, because then the perceived latency goes out the window. And I think there are definitely a lot of ways in which you can get very high relevance. For example, you can use simple classifiers like a BERT-based or DistilBERT-based model. In that case you may get low latency, but the relevance may not be as good compared to using, say, an OpenAI API where you're essentially using GPT-3.5, where you'll probably get better results.
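As a rough illustration of the trade-off described here, the sketch below picks between a small classifier and a hosted large model based on a latency budget. Everything in it is an assumption made for illustration: the `ModelOption` type, the `pick_model` helper, the option names, and especially the latency and relevance numbers, which are placeholders and not measurements of any real model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    typical_latency_ms: float  # hypothetical figure, not a benchmark
    relevance_score: float     # hypothetical offline quality score in [0, 1]


def pick_model(options, latency_budget_ms):
    """Return the most relevant option that fits the latency budget.

    If nothing fits the budget, fall back to the fastest option.
    """
    within_budget = [o for o in options if o.typical_latency_ms <= latency_budget_ms]
    if within_budget:
        return max(within_budget, key=lambda o: o.relevance_score)
    return min(options, key=lambda o: o.typical_latency_ms)


# Placeholder options: a small local classifier versus a hosted large model.
OPTIONS = [
    ModelOption("distilbert-classifier", typical_latency_ms=20.0, relevance_score=0.70),
    ModelOption("hosted-llm-api", typical_latency_ms=1500.0, relevance_score=0.90),
]

if __name__ == "__main__":
    for budget_ms in (100.0, 2000.0):
        print(budget_ms, pick_model(OPTIONS, budget_ms).name)
```

With a tight budget the helper falls back to the small classifier; with a generous budget (for example, when the user is kept busy and perceived latency matters less) it selects the larger model.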
