2min chapter

03: The Next Generation of LLMs with Jonathan Frankle of MosaicML

Replit AI Podcast

CHAPTER

The Importance of Evals in Product Development

The benchmarking stuff has been a little frustrating for us because people are gaming it in all sorts of ways, including training on it. For example, with the 3B model that we trained, we ran multiple A/B tests and got a net improvement of 50% over, like, Salesforce CodeGen. That was way better: the delta between CodeGen and our fine-tuned model. And I'd be curious to see, if one plays with the data mix, does that affect the completion rate? Because getting this right is really tough.
