

The Information Bottleneck
Ravid Shwartz-Ziv & Allen Roush
Two AI researchers, Ravid Shwartz-Ziv and Allen Roush, discuss the latest trends, news, and research in generative AI, LLMs, GPUs, and cloud systems.
Episodes

Dec 1, 2025 • 1h 45min
EP18: AI Robotics
In this episode, we hosted Judah Goldfeder, a PhD candidate at Columbia University and student researcher at Google, to discuss robotics, reproducibility in ML, and smart buildings.

Key topics covered:

Robotics challenges: We discussed why robotics has proved harder than many expected, especially compared with LLMs. The real world is unpredictable and unforgiving, and mistakes have physical consequences. Sim-to-real transfer remains a major bottleneck because simulators are tedious to configure accurately for each robot and environment. Unlike text, robotics lacks foundation models, partly due to limited clean, annotated datasets and the difficulty of collecting diverse real-world data.

Reproducibility crisis: We discussed how self-reported benchmarks can lead to p-hacking and irreproducible results. Centralized evaluation systems (such as Kaggle or the ImageNet challenges), where researchers submit algorithms for testing on hidden test sets, seem to drive faster progress.

Smart buildings: Judah's work at Google focuses on using ML to optimize HVAC systems, potentially reducing energy costs and carbon emissions significantly. The challenge is that every building is different, which makes simulation configuration extremely labor-intensive. Generative AI could help by automating the conversion of floor plans or images into accurate building simulations.

Links:
Judah's website - https://judahgoldfeder.com/

Music:
"Kid Kodi" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.
"Palms Down" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.

Nov 24, 2025 • 1h 6min
EP17: RL with Will Brown
In this conversation, Will Brown, a research lead at Prime Intellect specializing in reinforcement learning (RL) and multi-agent systems, explores the foundations and practical applications of RL with the hosts. Will shares insights into the challenges RL faces in LLMs, emphasizing the importance of online sampling and reward models. He discusses multi-agent dynamics, optimization techniques, and the role of game theory in AI development. The conversation also highlights the significance of intermediate results and future directions for RL across applications.
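As a rough illustration of the online-sampling-plus-reward-model loop discussed in the episode, here is a minimal REINFORCE-style sketch in PyTorch. Everything in it (the tiny TinyPolicy model, the stand-in reward_model, the hyperparameters) is a hypothetical toy, not Prime Intellect's training stack.

    # Toy RL-for-LMs loop: sample sequences online, score them with a
    # (stand-in) reward model, and push up the log-probability of
    # high-reward samples. Illustrative sketch only.
    import torch
    import torch.nn as nn

    VOCAB, HIDDEN, SEQ_LEN, BATCH = 32, 64, 8, 16

    class TinyPolicy(nn.Module):
        # Minimal autoregressive policy: embed token, step a GRU, emit logits.
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(VOCAB, HIDDEN)
            self.cell = nn.GRUCell(HIDDEN, HIDDEN)
            self.head = nn.Linear(HIDDEN, VOCAB)

        def rollout(self):
            # Sample sequences online and keep their log-probabilities.
            h = torch.zeros(BATCH, HIDDEN)
            tok = torch.zeros(BATCH, dtype=torch.long)  # fixed BOS token 0
            logps, toks = [], []
            for _ in range(SEQ_LEN):
                h = self.cell(self.embed(tok), h)
                dist = torch.distributions.Categorical(logits=self.head(h))
                tok = dist.sample()
                logps.append(dist.log_prob(tok))
                toks.append(tok)
            return torch.stack(toks, 1), torch.stack(logps, 1).sum(1)

    def reward_model(seqs):
        # Stand-in "reward model": favors sequences with distinct tokens.
        return torch.tensor([len(set(s.tolist())) / SEQ_LEN for s in seqs])

    policy = TinyPolicy()
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    for step in range(200):
        seqs, logp = policy.rollout()
        r = reward_model(seqs)
        adv = r - r.mean()               # simple baseline to cut variance
        loss = -(adv * logp).mean()      # REINFORCE objective
        opt.zero_grad(); loss.backward(); opt.step()

The key property this cartoon shares with real pipelines is that samples come from the current policy (online sampling) and the learning signal comes from a reward model rather than labeled targets.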

Nov 17, 2025 • 59min
EP16: AI News and Papers
Dive into the intriguing world of AI as the hosts dissect the quirks of conference review dynamics and the release of Kimi K2 Thinking. Discover Google's latest TPU advancements and their competition with NVIDIA. The importance of real-world data in robotics takes center stage, while the concept of Chain of Thought Hijacking raises alarms about model vulnerabilities. Simplifying reinforcement learning with JustRL presents new possibilities, and the Cosmos project sparks curiosity around AI-driven scientific discovery.

Nov 13, 2025 • 1h 23min
EP15: The Information Bottleneck and Scaling Laws with Alex Alemi
In this discussion, Alex Alemi—a prominent AI researcher from Anthropic, formerly with Google Brain and Disney—delves into the concept of the information bottleneck. He explains how it captures the essential aspects of data while avoiding overfitting. Alemi also highlights scaling laws, revealing how smaller experiments can forecast larger behaviors in AI. He offers insights on the importance of compression in understanding models and challenges researchers to pursue ambitious questions that address broader implications for society, such as job disruption.
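For readers new to the topic, the classical information bottleneck objective (the Tishby, Pereira, and Bialek formulation that this episode builds on, stated here for context rather than quoted from it) seeks a compressed representation T of the input X that stays as predictive of the target Y as possible:

    % Information bottleneck Lagrangian: keep I(X;T) small (compression)
    % while keeping I(T;Y) large (prediction); beta sets the trade-off.
    \min_{p(t \mid x)} \; I(X;T) - \beta \, I(T;Y)

    % Kaplan-style parameter-count scaling law: empirical power-law fits
    % like this are what let small experiments forecast larger runs.
    L(N) \approx \left( \frac{N_c}{N} \right)^{\alpha_N}

The second line is the standard empirical form from the scaling-laws literature, included as an assumed example of the kind of fit the conversation refers to.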

Nov 10, 2025 • 57min
EP14: AI News and Papers
Explore the intricate dynamics of AI in healthcare, where GPT-5's fragility raises pressing concerns about reliability and education. Discover Stanford's innovative Cartridges approach, which aims to cut down the hefty computational costs of long-context models. Delve into the transformative potential of Continuous Autoregressive Language Models and learn about practical strategies from the Smol Training Playbook. Join in the debate on benchmarks and dataset challenges in this fast-evolving field!

Nov 7, 2025 • 1h 21min
EP13: Recurrent-Depth Models and Latent Reasoning with Jonas Geiping
In this episode, we host Jonas Geiping of the ELLIS Institute and the Max Planck Institute for Intelligent Systems (Tübingen AI Center, Germany). We discussed his research on recurrent-depth models and latent reasoning in large language models (LLMs): what these models can and can't do, the challenges and next breakthroughs in the field, world models, and the future of developing better models. We also covered safety and interpretability, and the role of scaling laws in AI development. (A toy code sketch of the recurrent-depth idea follows these show notes.)

Chapters:
00:00 Introduction and Guest Introduction
01:03 Peer Review in Preprint Servers
06:57 New Developments in Coding Models
09:34 Open Source Models in Europe
11:00 Dynamic Layers in LLMs
26:05 Training Playbook Insights
30:05 Recurrent Depth Models and Reasoning Tasks
43:59 Exploring Recursive Reasoning Models
46:46 The Role of World Models in AI
48:41 Innovations in AI Training and Simulation
50:39 The Promise of Recurrent Depth Models
52:34 Navigating the Future of AI Algorithms
54:44 The Bitter Lesson of AI Development
59:11 Advising the Next Generation of Researchers
01:06:42 Safety and Interpretability in AI Models
01:10:46 Scaling Laws and Their Implications
01:16:19 The Role of PhDs in AI Research

Links and papers:
Jonas' website - https://jonasgeiping.github.io/
Scaling up test-time compute with latent reasoning: A recurrent depth approach - https://arxiv.org/abs/2502.05171
The Smol Training Playbook: The Secrets to Building World-Class LLMs - https://huggingface.co/spaces/HuggingFaceTB/smol-training-playbook
VaultGemma: A Differentially Private Gemma Model - https://arxiv.org/abs/2510.15001

Music:
"Kid Kodi" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.
"Palms Down" — Blue Dot Sessions — via Free Music Archive — CC BY-NC 4.0.
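The toy sketch referenced above: a recurrent-depth model reuses one weight-tied block many times, so "depth" becomes a test-time knob rather than a fixed architectural choice. This is a conceptual cartoon under that assumption, not the architecture from Geiping et al.'s paper.

    # Toy recurrent-depth LM: the same Transformer layer is applied
    # repeatedly to refine a latent state; iterating longer spends more
    # test-time compute on the same weights. Illustrative only.
    import torch
    import torch.nn as nn

    class RecurrentDepthLM(nn.Module):
        def __init__(self, vocab=100, dim=64, heads=4):
            super().__init__()
            self.embed = nn.Embedding(vocab, dim)
            self.block = nn.TransformerEncoderLayer(
                dim, heads, dim * 4, batch_first=True)
            self.head = nn.Linear(dim, vocab)

        def forward(self, tokens, n_iters=4):
            x = self.embed(tokens)       # input injection, fixed per step
            s = torch.zeros_like(x)      # recurrent latent state
            for _ in range(n_iters):     # more iterations = more "depth"
                s = self.block(s + x)    # same weights reused every step
            return self.head(s)

    model = RecurrentDepthLM()
    tokens = torch.randint(0, 100, (2, 16))
    cheap = model(tokens, n_iters=2)     # shallow, fast pass
    deep = model(tokens, n_iters=16)     # spend more test-time compute
    print(cheap.shape, deep.shape)       # both torch.Size([2, 16, 100])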

Nov 3, 2025 • 58min
EP12: Adversarial attacks and compression with Jack Morris
In this episode of the Information Bottleneck Podcast, we host Jack Morris, a PhD student at Cornell, to discuss adversarial examples (Jack created TextAttack, an early framework for adversarial attacks on NLP models), the Platonic representation hypothesis, the implications of inversion techniques, and the role of compression in language models.

Links:
Jack's website - https://jxmo.io/
TextAttack - https://arxiv.org/abs/2005.05909
How much do language models memorize? - https://arxiv.org/abs/2505.24832
DeepSeek OCR - https://www.arxiv.org/abs/2510.18234

Chapters:
00:00 Introduction and AI News Highlights
04:53 The Importance of Fine-Tuning Models
10:01 Challenges in Open Source AI Models
14:34 The Future of Model Scaling and Sparsity
19:39 Exploring Model Routing and User Experience
24:34 Jack's Research: TextAttack and Adversarial Examples
29:33 The Platonic Representation Hypothesis
34:23 Implications of Inversion and Security in AI
39:20 The Role of Compression in Language Models
44:10 Future Directions in AI Research and Personalization

Oct 28, 2025 • 1h 18min
EP11: JEPA with Randall Balestriero
Randall Balestriero, an assistant professor at Brown University specializing in representation learning, dives deep into Joint Embedding Predictive Architectures (JEPA). He explains how JEPA learns data representations without reconstruction, focusing on meaningful features while compressing irrelevant details. The discussion covers the challenges of model collapse, prediction tasks shaping feature learning, and the implications for AGI benchmarks. Balestriero also shares insights on evaluating JEPA models, the role of latent variables, and the growing opportunity in JEPA research.
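To make the no-reconstruction idea concrete, here is a minimal JEPA-style training step in PyTorch: a context encoder and predictor learn to match the embedding that a slowly updated target encoder assigns to another view of the same input, so the loss lives in latent space rather than pixel space. This is a hedged sketch of the general recipe (the encoders, the noise-based "views", and the EMA rate are all assumed stand-ins), not Balestriero's code.

    # Minimal JEPA-style step: predict the *embedding* of a hidden view
    # from a visible view; no pixel reconstruction. Toy stand-in only.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    DIM = 128

    def make_encoder():
        return nn.Sequential(nn.Linear(DIM, 256), nn.ReLU(), nn.Linear(256, 64))

    context_enc = make_encoder()            # trained by gradient descent
    target_enc = make_encoder()             # updated only by EMA
    target_enc.load_state_dict(context_enc.state_dict())
    predictor = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))
    opt = torch.optim.Adam(
        [*context_enc.parameters(), *predictor.parameters()], lr=1e-3)

    def train_step(x):
        # Two "views" of the same sample; a real JEPA would mask patches.
        ctx = x + 0.1 * torch.randn_like(x)
        tgt = x + 0.1 * torch.randn_like(x)
        pred = predictor(context_enc(ctx))  # predict in latent space
        with torch.no_grad():
            target = target_enc(tgt)        # stop-gradient target
        loss = F.mse_loss(pred, target)
        opt.zero_grad(); loss.backward(); opt.step()
        # EMA update of the target encoder helps avoid representation collapse.
        with torch.no_grad():
            for p_t, p_c in zip(target_enc.parameters(),
                                context_enc.parameters()):
                p_t.mul_(0.99).add_(0.01 * p_c)

    for _ in range(100):
        train_step(torch.randn(32, DIM))

The stop-gradient target and the EMA update are the two ingredients the episode's collapse discussion turns on: without them, both encoders can shortcut the task by mapping everything to a constant.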

Oct 20, 2025 • 1h 18min
EP10: Geometric Deep Learning with Michael Bronstein
In this episode, we talked with Michael Bronstein, a professor of AI at the University of Oxford and a scientific director at AITHYRA, about the fascinating world of geometric deep learning. We explored how understanding the geometric structures in data can enhance the efficiency and accuracy of AI models. Michael shared insights on the limitations of small neural networks and the ongoing debate about the role of scaling in AI. We also talked about the future of AI in scientific discovery, and the potential impact on fields like drug design and mathematics.
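For readers who want one concrete primitive behind much of geometric deep learning, here is a toy message-passing layer: each node aggregates transformed neighbor features with an update shared across all nodes, so permuting node labels permutes the outputs identically (the equivariance that graph structure demands). A generic assumed sketch, not code from the episode.

    # Toy message-passing layer: mean-aggregate transformed neighbor
    # features, then update each node with the same shared function.
    import torch
    import torch.nn as nn

    class MessagePassingLayer(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.msg = nn.Linear(dim, dim)      # transform neighbor features
            self.upd = nn.Linear(2 * dim, dim)  # combine node + aggregate

        def forward(self, h, adj):
            # h: (nodes, dim) features; adj: (nodes, nodes) 0/1 adjacency.
            deg = adj.sum(1, keepdim=True).clamp(min=1)
            agg = adj @ self.msg(h) / deg       # mean of neighbor messages
            return torch.relu(self.upd(torch.cat([h, agg], dim=-1)))

    h = torch.randn(5, 16)                      # 5 nodes, 16-dim features
    adj = (torch.rand(5, 5) > 0.5).float()
    adj = ((adj + adj.T) > 0).float().fill_diagonal_(0)  # symmetric, no loops
    layer = MessagePassingLayer(16)
    print(layer(h, adj).shape)                  # torch.Size([5, 16])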

Oct 13, 2025 • 1h 8min
EP9: AI in Natural Sciences with Tal Kachman
In this episode, we host Tal Kachman, an assistant professor at Radboud University, to explore the fascinating intersection of artificial intelligence and natural sciences. Prof. Kachman's research focuses on multiagent interaction, complex systems, and reinforcement learning. We dive deep into how AI is revolutionizing materials discovery, chemical dynamics modeling, and experimental design through self-driving laboratories. Prof. Kachman shares insights on the challenges of integrating physics and chemistry with AI systems, the critical role of high-throughput experimentation in accelerating scientific discovery, and the transformative potential of generative models to unlock new materials and functionalities.


