MPT-7B and The Beginning of Context=Infinity — with Jonathan Frankle and Abhinav Venigalla of MosaicML

Latent Space: The AI Engineer Podcast

NOTE

Challenges in Evaluating Large Language Models

There is no real evidence that certain types of training data, such as code or Wikipedia, lead to better reasoning in language models.
Different data mixes vary widely in effectiveness, with C4 performing particularly well despite its questionable pre-processing methods.
Evaluating large language models is extremely difficult, and current metrics do not fully capture their practical performance.
