MPT-7B and The Beginning of Context=Infinity — with Jonathan Frankle and Abhinav Venigalla of MosaicML

Latent Space: The AI Engineer Podcast

Innovations in AI Model Training

This chapter explores Cerebras' pioneering work in wafer-scale computing and the evolution of machine learning software. It covers the development of the MPT-7B model and the complexities of selecting data for training large language models. The conversation also addresses evaluation challenges and innovations in model training, showing how performance, speed, and data diversity interact.
