
Prof. Randall Balestriero - LLMs without pretraining and SSL

Machine Learning Street Talk (MLST)


Rethinking Language Model Training

This chapter examines the trade-offs of pre-training large language models via next-token prediction, asking whether the heavy resource investment is justified when simpler supervised learning can solve the target task directly. It weighs specialized against generalized models, emphasizing the importance of precise task definition and the potential of multi-task learning. The discussion then turns to what it means for a model to comprehend language, the limits of evaluation benchmarks, and philosophical questions about language generation.
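To make the contrast concrete, here is a minimal PyTorch sketch (not code from the episode; the toy embedding model, dimensions, and head names are invented for illustration). Next-token prediction trains a language-modeling head to predict every following token in a sequence, while the supervised alternative trains a task head to predict a single label per example:

```python
import torch
import torch.nn.functional as F

vocab_size, num_classes, d = 100, 5, 32

# Toy "model": a shared embedding with two output heads. A real LLM would
# run a transformer over the context; a bare embedding suffices to show
# how the two losses are shaped.
embed = torch.nn.Embedding(vocab_size, d)
lm_head = torch.nn.Linear(d, vocab_size)      # predicts the next token
task_head = torch.nn.Linear(d, num_classes)   # predicts a task label

tokens = torch.randint(0, vocab_size, (4, 16))  # batch of token sequences
labels = torch.randint(0, num_classes, (4,))    # one task label per sequence

h = embed(tokens)  # (batch, seq, d)

# Pretraining objective: cross-entropy at every position, each position's
# target being the next token (drop the last position, which has no target).
lm_logits = lm_head(h[:, :-1])
ntp_loss = F.cross_entropy(lm_logits.reshape(-1, vocab_size),
                           tokens[:, 1:].reshape(-1))

# Supervised objective: one cross-entropy term per example, predicting the
# task label from a pooled sequence representation.
task_logits = task_head(h.mean(dim=1))
sup_loss = F.cross_entropy(task_logits, labels)

print(f"next-token loss: {ntp_loss.item():.3f}, "
      f"supervised loss: {sup_loss.item():.3f}")
```

The resource asymmetry the chapter questions falls out of these shapes: the pretraining loss is summed over every position of an enormous corpus, while the supervised loss touches only one label per example on the task that actually matters.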

