03: The Next Generation of LLMs with Jonathan Frankle of MosaicML

Replit AI Podcast

The Importance of Evals in Product Development

The benchmarking stuff has been a little frustrating for us because people are gaming it in all sorts of ways, including training on it. For example, with the 3B that we trained, we did multiple A/B tests and we got a net improvement of 50% over, like, Salesforce CodeGen. That was way better, the delta between CodeGen and our fine-tuned model. And I'd be curious to see, if one plays with the data mix, does that affect the completion rate? Because getting this is really tough.
