
Machine Learning Street Talk (MLST)
Prof. Randall Balestriero - LLMs without pretraining and SSL
Apr 23, 2025
Randall Balestriero, an AI researcher known for his work on self-supervised learning and on geographic bias in AI models, explores surprising findings in AI training. He shows that large language models can perform well on certain tasks even without extensive pre-training. Randall also highlights deep similarities between self-supervised and supervised learning, arguing that both leave room for improvement. Finally, he discusses biases in climate models, demonstrating the risks of relying on their predictions for vulnerable regions in particular, which carries significant policy implications.
Podcast summary created with Snipd AI
Quick takeaways
- Large language models can achieve competitive performance without pre-training, challenging the need for extensive datasets in certain tasks.
- Bias and fairness analyses reveal the models' limitations in representing diverse cultural perspectives, calling for careful consideration of training data and methods.
Deep dives
Surprising Stability of Overparameterized Models
Empirical findings reveal that heavily overparameterized models, such as those with 7 billion parameters, exhibit stable training dynamics without severe overfitting, even when trained on limited data. For instance, using a dataset of merely 20,000 samples, such models perform comparably to pre-trained counterparts on tasks like sentiment analysis. This challenges the received belief that extensive pre-training is necessary for large models, suggesting that models trained from scratch can reach acceptable performance on narrow tasks. Model scale itself appears to help prevent overfitting, echoing insights previously noted for computer vision models.
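The idea that more parameters can mean less overfitting can be illustrated with a toy example. The sketch below is my own minimal analogy (not from the episode): a linear model with several times more parameters than training samples, fit with the minimum-norm least-squares solution (what gradient descent from zero initialization converges to). It interpolates the training data exactly, yet still generalizes better than a trivial predict-the-mean baseline; all dimensions and noise levels here are arbitrary choices for illustration.

```python
# Toy sketch of benign overparameterization: a linear model with 4x more
# parameters than training samples interpolates the training set yet still
# beats a trivial baseline on held-out data. Illustrative assumption only.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, d = 100, 1000, 400        # 4x more parameters than samples
w_true = rng.normal(size=d) / np.sqrt(d)   # ground-truth weights, norm ~1
sigma = 0.3                                # label-noise standard deviation

X_tr = rng.normal(size=(n_train, d))
y_tr = X_tr @ w_true + sigma * rng.normal(size=n_train)
X_te = rng.normal(size=(n_test, d))
y_te = X_te @ w_true + sigma * rng.normal(size=n_test)

# Minimum-norm interpolating solution via the pseudoinverse
w_hat = np.linalg.pinv(X_tr) @ y_tr

train_mse = np.mean((X_tr @ w_hat - y_tr) ** 2)   # ~0: exact interpolation
test_mse = np.mean((X_te @ w_hat - y_te) ** 2)
null_mse = np.mean((y_te - y_te.mean()) ** 2)     # predict-the-mean baseline

print(f"train={train_mse:.2e}  test={test_mse:.3f}  baseline={null_mse:.3f}")
```

Despite fitting the noisy training labels perfectly, the minimum-norm solution's implicit regularization keeps the test error below the baseline, a linear-algebra analogue of the stability the episode reports for large language models trained from scratch.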