
The Information Bottleneck EP11: JEPA with Randall Balestriero
Oct 28, 2025
Randall Balestriero, an assistant professor at Brown University specializing in representation learning, dives deep into Joint Embedding Predictive Architectures (JEPA). He explains how JEPA learns data representations without reconstruction, focusing on meaningful features while compressing irrelevant details. The discussion covers the challenge of representation collapse, how prediction tasks shape feature learning, and the implications for AGI benchmarks. Balestriero also shares insights on evaluating JEPA models, the role of latent variables, and the growing opportunity in JEPA research.
Prediction Over Reconstruction
- JEPA trains embeddings by predicting related views rather than reconstructing inputs.
- This focuses learning on abstract semantics instead of pixel-perfect details (see the sketch below).
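
To make the prediction-over-reconstruction idea concrete, here is a minimal sketch of a JEPA-style objective in PyTorch. The class name, the MLP encoders, and the EMA momentum are illustrative assumptions for a toy setup, not Balestriero's or any published model's exact architecture; the point is only that the loss lives in embedding space.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyJEPA(nn.Module):
    """Toy JEPA: predict the embedding of a target view from a context
    view. The loss is computed entirely in embedding space, so nothing
    is reconstructed at the pixel level."""

    def __init__(self, dim: int = 128):
        super().__init__()
        def mlp():
            return nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.encoder = mlp()          # online (student) encoder
        self.predictor = mlp()        # maps context embeddings to predicted target embeddings
        self.target_encoder = mlp()   # teacher encoder: EMA copy, no gradients
        self.target_encoder.load_state_dict(self.encoder.state_dict())
        for p in self.target_encoder.parameters():
            p.requires_grad = False

    @torch.no_grad()
    def update_target(self, momentum: float = 0.99):
        # Teacher-student EMA update, one common anti-collapse ingredient.
        for p_t, p_s in zip(self.target_encoder.parameters(), self.encoder.parameters()):
            p_t.mul_(momentum).add_(p_s, alpha=1.0 - momentum)

    def loss(self, context_view: torch.Tensor, target_view: torch.Tensor) -> torch.Tensor:
        pred = self.predictor(self.encoder(context_view))
        with torch.no_grad():
            target = self.target_encoder(target_view)
        # Regress predicted embeddings onto teacher embeddings: abstract
        # semantics are rewarded, pixel-perfect detail is not.
        return F.mse_loss(pred, target)

# Usage with random stand-in "views" (in practice: masked or cropped inputs):
model = TinyJEPA()
context, target = torch.randn(32, 128), torch.randn(32, 128)
loss = model.loss(context, target)
loss.backward()
model.update_target()
```

In real JEPA variants the context and target views come from masking or cropping the same image or clip, and the predictor is usually conditioned on which regions it must predict; this toy version omits that conditioning.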
Selective Compression Benefits Most Tasks
- Compressing irrelevant details improves general downstream performance for most tasks.
- You trade niche capabilities (like counting leaves) for broadly useful representations.
Defend Against Representation Collapse
- Prevent collapse by adding anti-collapse mechanisms like covariance regularizers or teacher-student setups (a covariance-regularizer sketch follows this list).
- Tune these components carefully because collapse is the dominant failure mode.
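
As a concrete illustration of the first mechanism named above, here is a sketch of a VICReg-style variance/covariance regularizer on a batch of embeddings. The function name and the coefficient defaults are assumptions chosen for illustration, not tuned values from the episode.

```python
import torch

def anti_collapse_penalty(z: torch.Tensor,
                          var_weight: float = 25.0,
                          cov_weight: float = 1.0,
                          eps: float = 1e-4) -> torch.Tensor:
    """z: (batch, dim) embeddings. The penalty grows when embeddings
    collapse to a point (low per-dimension variance) or to a low-rank
    subspace (correlated dimensions)."""
    z = z - z.mean(dim=0)
    # Variance term: hinge pushing each dimension's std above 1,
    # so the encoder cannot map every input to the same vector.
    std = torch.sqrt(z.var(dim=0) + eps)
    var_loss = torch.relu(1.0 - std).mean()
    # Covariance term: penalize off-diagonal covariance so dimensions
    # carry decorrelated, non-redundant information.
    n, d = z.shape
    cov = (z.T @ z) / (n - 1)
    off_diag = cov - torch.diag(torch.diag(cov))
    cov_loss = off_diag.pow(2).sum() / d
    return var_weight * var_loss + cov_weight * cov_loss

# Usage: add to the embedding-prediction loss during training.
z = torch.randn(256, 128, requires_grad=True)
penalty = anti_collapse_penalty(z)
```

The penalty would typically be added to the prediction loss (e.g. `total = pred_loss + anti_collapse_penalty(z)`); the two weights are exactly the kind of component the snip says must be tuned carefully, since collapse is the dominant failure mode.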
