Prof. Randall Balestriero - LLMs without pretraining and SSL

00:00

Rethinking Language Model Training

This chapter examines the trade-offs of pre-training large language models via next-token prediction, asking whether the resource investment is justified relative to simpler supervised learning approaches. It explores the implications of specialized versus generalized models, emphasizing the importance of task definition and the potential of multi-task learning. The discussion also covers model comprehension, evaluation benchmarks, and philosophical questions surrounding language generation.
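
To make the contrast concrete, here is a minimal sketch (not from the episode; the toy backbone, vocabulary size, and data are all hypothetical) of the two training objectives being compared: next-token prediction over raw token sequences versus direct supervised learning on a downstream task.

    # Illustrative sketch only: a toy model trained under the two objectives
    # discussed in this chapter. All shapes and data here are made up.
    import torch
    import torch.nn.functional as F

    vocab_size, num_classes, d_model = 100, 2, 32

    # A toy "backbone": embed tokens, then mean-pool to a sequence representation.
    embed = torch.nn.Embedding(vocab_size, d_model)
    lm_head = torch.nn.Linear(d_model, vocab_size)    # next-token prediction head
    clf_head = torch.nn.Linear(d_model, num_classes)  # direct supervised head

    tokens = torch.randint(0, vocab_size, (4, 16))    # 4 sequences of length 16
    labels = torch.randint(0, num_classes, (4,))      # task labels for supervised training

    h = embed(tokens)                                 # (batch, seq, d_model)

    # Objective 1: next-token prediction (pretraining). Predict token t+1 from
    # the representation at position t, so logits and targets are shifted by one.
    lm_logits = lm_head(h[:, :-1])                    # (batch, seq-1, vocab)
    lm_loss = F.cross_entropy(lm_logits.reshape(-1, vocab_size),
                              tokens[:, 1:].reshape(-1))

    # Objective 2: direct supervised learning on the task of interest.
    clf_logits = clf_head(h.mean(dim=1))              # pool, then classify
    clf_loss = F.cross_entropy(clf_logits, labels)

    print(f"next-token loss: {lm_loss.item():.3f}, supervised loss: {clf_loss.item():.3f}")

The pretraining objective needs no task labels but spends compute modeling the full token distribution; the supervised objective targets the task directly but requires labeled data and yields a narrower model, which is the trade-off the chapter weighs.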
