

Interconnects
Nathan Lambert
Audio essays about the latest developments in AI and interviews with leading scientists in the field. Breaking the hype, understanding what's under the hood, and telling stories. www.interconnects.ai
Episodes

Feb 16, 2024 • 9min
Releases! OpenAI’s Sora for video, Gemini 1.5's infinite context, and a secret Mistral model
Emergency blog! Three things you need to know from the ML world that arrived yesterday.
This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/sora-gemini-and-mistral-next
0:00 OpenAI's Sora for video, Gemini 1.5, and a secret Mistral model
0:53 Sora: OpenAI's text-to-video model
4:59 Gemini 1.5: Google's effectively infinite context length
8:01 Mistral-next: Another funny release method
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-gemini-mistral/img_015.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-gemini-mistral/img_023.png
Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-gemini-mistral/img_026.png
Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-gemini-mistral/img_036.png

Feb 14, 2024 • 8min
Why reward models are still key to understanding alignment
In an era dominated by direct preference optimization and LLM-as-a-judge, why do we still need a model to output only a scalar reward?
This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
Source code: https://github.com/natolambert/interconnects-tools
Podcast figures:
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reward-models/img_004.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reward-models/img_009.png
0:00 Why reward models are still key to understanding alignment
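To make the scalar-reward idea concrete, here is a minimal sketch, not the code from the post: a reward model is an LM-style backbone whose final hidden state feeds a linear head producing a single scalar, trained so preferred completions out-score rejected ones. The class, function, and variable names below are illustrative assumptions.

```python
# Minimal, illustrative sketch of a scalar reward model head (assumed, not from the post).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScalarRewardHead(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)  # maps a hidden state to one scalar

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size) from any LM backbone.
        last_token = hidden_states[:, -1, :]        # score the final token's state
        return self.score(last_token).squeeze(-1)   # (batch,) scalar rewards

def bradley_terry_loss(chosen: torch.Tensor, rejected: torch.Tensor) -> torch.Tensor:
    # Pairwise preference objective: the chosen completion should out-score the rejected one.
    return -F.logsigmoid(chosen - rejected).mean()

# Toy usage with random hidden states standing in for a real LM backbone.
head = ScalarRewardHead(hidden_size=64)
h_chosen, h_rejected = torch.randn(4, 16, 64), torch.randn(4, 16, 64)
loss = bradley_terry_loss(head(h_chosen), head(h_rejected))
```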

Feb 7, 2024 • 10min
Alignment-as-a-Service: Scale AI vs. the new guys
This podcast discusses the challenges faced by Scale AI, a startup providing data services for reinforcement learning from human feedback (RLHF). It covers Scale AI's revenue growth, its partnerships with major labs, and its defense arm. The episode also explores scaling alignment-as-a-service through AI feedback and the potential business opportunities in RLHF.

Feb 1, 2024 • 9min
Open Language Models (OLMos) and the LLM landscape
A small model at the beginning of big changes.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/olmo
0:00 Open Language Models (OLMos) and the LLM landscape
6:24 Thought experiments
7:51 The LLM landscape heading into 2024
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/olmo/img_010.png

Jan 29, 2024 • 19min
Model merging lessons in The Waifu Research Department
Note: some of the audio in the second half is a little wonky, but the general voice was upgraded, so hopefully it's a little less "poppy" in the meantime! I'm trying to fix little pronunciation problems on a weekly basis (e.g., some of the months were wonky). Thanks to my early fans! It'll keep improving.
When what seems like pure LLM black magic is actually supported by the literature.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/model-merging
00:00 Model merging lessons in The Waifu Research Department
02:21 How and why does model merging work?
07:13 Aside: merging vs. ensembles vs. mixture of experts
08:21 Why are people doing this?
11:22 Tools & Links
11:51 Brief (visual) literature review
12:07 Full model merging and recent methods
15:55 Weight averaging during pretraining
17:18 LoRA merging
17:53 More background
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_005.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_016.png
Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_042.png
Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_051.png
Figure 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_055.png
Figure 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_058.png
Figure 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_060.png
Figure 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_062.png
Figure 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_065.png
Figure 10: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_075.png
Figure 11: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_077.png
Figure 12: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_084.png
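For a concrete picture of the simplest kind of merge touched on in the episode, here is a minimal sketch, assuming a plain linear weight average of two checkpoints that share the same architecture; the helper and variable names are hypothetical, not code from the post or the surveyed methods.

```python
# Minimal, illustrative sketch: linear weight averaging of two checkpoints (assumed, not from the post).
import torch

def linear_merge(state_dict_a: dict, state_dict_b: dict, alpha: float = 0.5) -> dict:
    """Return a new state dict: alpha * A + (1 - alpha) * B, key by key."""
    assert state_dict_a.keys() == state_dict_b.keys(), "architectures must match"
    return {
        name: alpha * state_dict_a[name] + (1.0 - alpha) * state_dict_b[name]
        for name in state_dict_a
    }

# Toy usage with two small "checkpoints" of identical shape.
model_a = {"linear.weight": torch.randn(8, 8), "linear.bias": torch.randn(8)}
model_b = {"linear.weight": torch.randn(8, 8), "linear.bias": torch.randn(8)}
merged = linear_merge(model_a, model_b, alpha=0.3)
```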

Jan 24, 2024 • 10min
Local LLMs, some facts some fiction
The podcast discusses the benefits of local LLMs, strategies to optimize latency, and the integration of LLMs into consumer devices. It explores the role of local models in personalization and in optimizing for inference. It also discusses how ML labs' larger ambitions shape what comes next, highlighting Ollama's popularity along with Meta's build-out plans and open-source strategy.

Jan 17, 2024 • 8min
Multimodal blogging: My AI tools to expand your audience
This podcast discusses multimodal blogging, AI tools for content creation, and expanding audience reach. The speaker shares their workflow for building a suite of tools for bloggers and explores AI tools like Passport, audio conditioning, and voice cloning. They also discuss future advancements in text-to-video models and the automation of research talks and video creation.

Jan 10, 2024 • 16min
Multimodal LM roundup: Unified IO 2, inputs and outputs, Gemini, LLaVA-RLHF, and RLHF questions
This podcast discusses recent developments in the multimodal space, including the Unified IO 2 model, collecting preference data for images, LLaVA-RLHF experiments, and challenges in multimodal RLHF. It explores the architecture and challenges of multimodal models, the potential of GPT-4V in multimodal RLHF, and the use of RLHF techniques in multimodal models. It also discusses the importance of clearer terminology and the adoption of synthetic data in this context.

Jan 5, 2024 • 14min
Where 2024’s “open GPT4” can’t match OpenAI’s
And why the comparisons don't really matter. Repeated patterns in the race for reproducing ChatGPT, another year of evaluation crises, and people who will take awesome news too far.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/open-gpt4-limitations
00:00 Where 2024's "open GPT4" can't match OpenAI's
03:19 Models vs. products
04:51 RLHF progress: Revisiting Llama 2's release and potential in 2024
08:30 Smaller scale open RLHF
10:33 Opportunities
12:24 Commentary

Dec 21, 2023 • 36min
Interviewing Tri Dao and Michael Poli of Together AI on the future of LLM architectures
Tri Dao, an incoming professor at Princeton and Chief Scientist at Together AI, joins Michael Poli, a Stanford PhD graduate and research scientist at Together AI. They dive into why traditional attention mechanisms may not scale effectively and introduce innovative models like Striped Hyena and Mamba. The duo discusses hardware optimization for these architectures and predicts exciting developments in AI for 2024, challenging the dominance of current transformer models. Their insights reflect a transformative wave in machine learning.