MPT-7B and The Beginning of Context=Infinity — with Jonathan Frankle and Abhinav Venigalla of MosaicML

Latent Space: The AI Engineer Podcast

Innovations in AI Model Training

This chapter explores Cerebras' pioneering work in wafer-scale computing and the evolution of software for machine learning. It covers the development of the MPT-7B model and the complexities of selecting data for training large language models. The conversation also addresses evaluation challenges and innovations in model training, highlighting the trade-offs among performance, training speed, and data diversity.
