
The Inside View

Latest episodes

Sep 8, 2023 • 2h 54min

Joscha Bach on how to stop worrying and love AI

AI researcher and cognitive scientist Joscha Bach discusses the relationship between nuclear weapons and AI x-risk, global warming, delaying technology, humanity's locust mode, scenarios for delaying AI, and the dangers of AI regulation.
Aug 11, 2023 • 23min

Erik Jones on Automatically Auditing Large Language Models

Erik Jones, a PhD candidate at Berkeley, examines how to enhance the safety and alignment of large language models. He discusses his innovative paper on automatically auditing these models, exploring the vulnerabilities they face from adversarial attacks. Erik shares insights on the importance of discrete optimization and how it can reveal hidden model behaviors. He also delves into the implications of using language models for sensitive topics and the need for automated auditing methods to ensure reliability and robustness in AI systems.
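To make the "discrete optimization" idea concrete, here is a minimal toy sketch, not the algorithm from Erik's paper: a greedy coordinate-ascent search over prompt tokens that tries to maximize a scoring function standing in for "how likely the model is to produce a target behavior". The vocabulary, the trigger words, and the score_prompt stub are all assumptions for illustration.

```python
import random

# Toy stand-in for "log-probability the model assigns to a target output
# given this prompt". A real audit would query an actual language model;
# here the score simply rewards prompts containing certain trigger tokens.
def score_prompt(tokens):
    triggers = {"please", "ignore", "previous", "instructions"}
    return sum(1.0 for t in tokens if t in triggers)

VOCAB = ["please", "ignore", "previous", "instructions",
         "hello", "weather", "story", "poem"]

def greedy_coordinate_ascent(prompt_len=4, n_rounds=5, seed=0):
    """Greedy discrete search: repeatedly try replacing each position
    with every vocabulary token and keep the best-scoring prompt."""
    rng = random.Random(seed)
    prompt = [rng.choice(VOCAB) for _ in range(prompt_len)]
    best = score_prompt(prompt)
    for _ in range(n_rounds):
        for i in range(prompt_len):
            for tok in VOCAB:
                candidate = prompt[:i] + [tok] + prompt[i + 1:]
                s = score_prompt(candidate)
                if s > best:
                    prompt, best = candidate, s
    return prompt, best

if __name__ == "__main__":
    prompt, score = greedy_coordinate_ascent()
    print("found prompt:", prompt, "score:", score)
```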
Aug 9, 2023 • 12min

Dylan Patel on the GPU Shortage, Nvidia and the Deep Learning Supply Chain

Dylan Patel is Chief Analyst at SemiAnalysis, a boutique semiconductor research and consulting firm specializing in the semiconductor supply chain, from chemical inputs to fabs to design IP and strategy. The SemiAnalysis substack has ~50,000 subscribers and is the second biggest tech substack in the world. In this interview we discuss the current GPU shortage, why acquiring hardware is a multi-month process, the deep learning hardware supply chain, and Nvidia's strategy. Youtube: https://youtu.be/VItz2oEq5pA Transcript: https://theinsideview.ai/dylan
Aug 4, 2023 • 4min

Tony Wang on Beating Superhuman Go AIs with Adversarial Policies

Tony is a PhD student at MIT and an author of "Adversarial Policies Beat Superhuman Go AIs", accepted as an oral presentation at the International Conference on Machine Learning (ICML). Paper: https://arxiv.org/abs/2211.00241 Youtube: https://youtu.be/Tip1Ztjd-so
Aug 1, 2023 • 25min

David Bau on Editing Facts in GPT, AI Safety and Interpretability

David Bau is an Assistant Professor studying the structure and interpretation of deep networks, and a co-author of "Locating and Editing Factual Associations in GPT", which introduced Rank-One Model Editing (ROME), a method that allows users to alter the weights of a GPT model, for instance by forcing it to output that the Eiffel Tower is in Rome. David is a leading researcher in interpretability, with an interest in how this could help AI Safety. The main thesis of David's lab is that understanding the rich internal structure of deep networks is a grand and fundamental research question with many practical implications, and they aim to lay the groundwork for human-AI collaborative software engineering, where humans and machine-learned models both teach and learn from each other.

David's lab: https://baulab.info/
Patreon: https://www.patreon.com/theinsideview
Twitter: https://twitter.com/MichaelTrazzi
Website: https://theinsideview.ai

TOC
[00:00] Intro
[01:16] Interpretability
[02:27] AI Safety, Out of Domain behavior
[04:23] It's difficult to predict which AI application might become dangerous or impactful
[06:00] ROME / Locating and Editing Factual Associations in GPT
[13:04] Background story for the ROME paper
[15:41] Twitter Q: where does key value abstraction break down in LLMs?
[19:03] Twitter Q: what are the tradeoffs in studying the largest models?
[20:22] Twitter Q: are there competitive and cleaner architectures than the transformer?
[21:15] Twitter Q: is decoder-only a contributor to the messiness? or is time-dependence beneficial?
[22:45] Twitter Q: how could ROME deal with superposition?
[23:30] Twitter Q: where is the Eiffel tower actually located?
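To give a feel for what "rank-one" editing means, here is a minimal numpy sketch of modifying a single weight matrix with an outer-product update. This is only in the spirit of ROME, not the paper's actual derivation (which solves for the update from key/value vectors estimated inside a specific MLP layer of GPT); the dimensions and vectors here are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden dimension

# Pretend this is one MLP projection matrix inside a transformer layer.
W = rng.normal(size=(d, d))

# "Key": a hidden activation standing in for the subject (e.g. "Eiffel Tower").
# "Value": the activation we would like the layer to emit for that key
# (e.g. one that downstream layers decode as "Rome").
k = rng.normal(size=d)
v_target = rng.normal(size=d)

# Rank-one update W' = W + u k^T / (k^T k), chosen so that W' @ k == v_target
# while leaving directions orthogonal to k unchanged.
u = v_target - W @ k
W_edited = W + np.outer(u, k) / (k @ k)

print("before edit:", np.allclose(W @ k, v_target))         # False
print("after edit: ", np.allclose(W_edited @ k, v_target))  # True
```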
Jul 26, 2023 • 20min

Alexander Pan on the MACHIAVELLI benchmark

Alexander Pan, a first-year student at Berkeley, discusses the MACHIAVELLI benchmark paper on measuring trade-offs between reward and ethical behavior in AI agents. They explore creating an artificial conscience in language models, balancing reward with morality, and AI risks such as degraded political discourse and malware development.
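As a toy illustration of the reward-versus-ethics trade-off the benchmark measures (this is not MACHIAVELLI's actual scoring), one can imagine each agent trajectory carrying a game reward and a count of flagged harmful actions, and comparing agents on both axes. The agent names and numbers below are invented.

```python
from dataclasses import dataclass

@dataclass
class Trajectory:
    reward: float    # in-game reward collected
    violations: int  # number of actions flagged as unethical

def tradeoff_score(traj: Trajectory, penalty: float = 1.0) -> float:
    """Toy aggregate: reward minus a penalty per flagged action.
    A purely reward-maximizing agent ignores the penalty term."""
    return traj.reward - penalty * traj.violations

agents = {
    "reward_maximizer": Trajectory(reward=120.0, violations=15),
    "ethics_constrained": Trajectory(reward=90.0, violations=2),
}

for name, traj in agents.items():
    print(name, "reward:", traj.reward,
          "violations:", traj.violations,
          "trade-off score:", tradeoff_score(traj))
```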
Jul 24, 2023 • 18min

Vincent Weisser on Funding AI Alignment Research

Vincent is currently spending his time supporting AI alignment efforts, as well as investing across AI, semiconductors, energy, crypto, bio and deeptech. His mission is to improve science, augment human capabilities, have a positive impact, help reduce existential risks and extend healthy human lifespan. Youtube: https://youtu.be/weRoJ8KN2f0

Outline
(00:00) Why Is Vincent Excited About the ICML Conference
(01:30) Vincent's Background In AI Safety
(02:23) Funding AI Alignment Through Crypto, Bankless
(03:35) Taxes When Donating Crypto
(04:09) Alignment Efforts Vincent Is Excited About
(04:39) Is AI Alignment Currently Funding Constrained
(06:23) Bottlenecks In Evaluating Grants, Diversity Of Funding Sources
(07:22) Impact Markets, Retroactive Funding
(08:57) On The Difficulty Of Evaluating Uncertain AI Alignment Projects
(10:05) Funding Academic Labs To Transition To Alignment Work
(11:54) People Should Act On Their Beliefs And Make Stuff Happen
(13:15) Vincent's Model: Don't Always Assume Someone Else Will Fund This
(13:49) How To Be Agentic: Start Donating, Spread The Message, AI Safety Fundamentals
(15:00) You Wouldn't Start Investing With 1M Dollars, Same With Donating
(16:13) Is Vincent Acting As If Timelines Were Short And The Risk Was High
(17:10) Is Vincent Optimistic When He Wakes Up In The Morning
Jul 19, 2023 • 1h 17min

[JUNE 2022] Aran Komatsuzaki on Scaling, GPT-J and Alignment

Aran Komatsuzaki is an ML PhD student at GaTech and lead researcher at EleutherAI, where he was one of the authors on GPT-J. In June 2022 we recorded an episode on scaling, following up on the first Ethan Caballero episode (where we mentioned Aran as an influence on how Ethan started thinking about scaling).

Note: For some reason I procrastinated on editing the podcast, then had a lot of in-person podcasts, so I left this one as something to edit later, until the date was so distant from June 2022 that I thought publishing did not make sense anymore. In July 2023 I'm trying that "one video a day" challenge (well, I missed some days, but I'm trying to get back on track), so I thought it made sense to release it anyway, and after a second watch it's somehow interesting to see how excited Aran was about InstructGPT, which turned out to be quite useful for things like ChatGPT.

Outline
(00:00) intro
(00:53) the legend of the two AKs, Aran's arXiv reading routine
(04:14) why Aran expects Alignment to be the same as some other ML problems
(05:44) what Aran means when he says "AGI"
(10:24) what Aran means by "human-level at doing ML research"
(11:31) software improvement happening before hardware improvement
(13:00) is scale all we need?
(15:25) how "Scaling Laws for Neural Language Models" changed the process of doing experiments
(16:22) how Aran scale-pilled Ethan
(18:46) why Aran was already scale-pilled before GPT-2
(20:12) Aran's 2019 scaling paper: "One epoch is all you need"
(25:43) Aran's June 2022 interest: T0 and InstructGPT
(31:33) Encoder-Decoder performs better than encoder if multi-task-finetuned
(33:30) Why the Scaling Law might be different for T0-like models
(37:15) The Story Behind GPT-J
(41:40) Hyperparameters and architecture changes in GPT-J
(43:56) GPT-J's throughput
(47:17) 5 weeks of training using 256 TPU cores
(50:34) did publishing GPT-J accelerate timelines?
(55:39) how Aran thinks about Alignment, defining Alignment
(58:19) in practice: improving benchmarks, but deception is still a problem
(1:00:49) main difficulties in evaluating language models
(1:05:07) how Aran sees the future: AIs aligning AIs, merging with AIs, Aran's takeoff scenario
(1:10:09) what Aran thinks we should do given how he sees the next decade
(1:12:34) regulating access to AGI
(1:14:50) what might happen: preventing some AI authoritarian regime
(1:15:42) conclusion, where to find Aran
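Since much of the conversation revolves around scaling laws, here is a minimal sketch of fitting a saturating power law loss ≈ a * N^(-alpha) + c to loss-versus-parameter-count points. The data points are invented for illustration; they are not numbers from the episode or from "Scaling Laws for Neural Language Models".

```python
import numpy as np

# Made-up (parameter count, loss) points, for illustration only.
N = np.array([1e6, 1e7, 1e8, 1e9, 1e10])
loss = np.array([5.2, 4.1, 3.3, 2.8, 2.5])

def fit_power_law(N, loss):
    """Fit loss ≈ a * N^(-alpha) + c by grid-searching alpha and solving
    the remaining linear least-squares problem for (a, c)."""
    best = None
    for alpha in np.linspace(0.01, 1.0, 200):
        X = np.column_stack([N ** (-alpha), np.ones_like(N)])
        (a, c), *_ = np.linalg.lstsq(X, loss, rcond=None)
        err = np.sum((X @ np.array([a, c]) - loss) ** 2)
        if best is None or err < best[0]:
            best = (err, a, alpha, c)
    return best[1:]

a, alpha, c = fit_power_law(N, loss)
print(f"fit: loss ≈ {a:.1f} * N^(-{alpha:.2f}) + {c:.2f}")
```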
Jul 16, 2023 • 1h 30min

Curtis Huebner on Doom, AI Timelines and Alignment at EleutherAI

Curtis, also known on the internet as AI_WAIFU, is the head of Alignment at EleutherAI. In this episode we discuss the massive orders of H100s from different actors, why he thinks AGI is 4-5 years away, why he thinks we're 90% "toast", his comment on Eliezer Yudkowsky's Death with Dignity, and what kind of Alignment projects are currently going on at EleutherAI, especially a project with Markov chains and the Alignment Minetest project that he is currently leading.

Youtube: https://www.youtube.com/watch?v=9s3XctQOgew
Transcript: https://theinsideview.ai/curtis
Death with Dignity: https://www.lesswrong.com/posts/j9Q8bRmwCgXRYAgcJ/miri-announces-new-death-with-dignity-strategy
Alignment Minetest: https://www.eleuther.ai/projects/alignment-minetest
Alignment Minetest update: https://blog.eleuther.ai/minetester-intro/

OUTLINE
(00:00) Highlights / Intro
(00:50) The Fuck That Noise Comment On Death With Dignity
(10:28) The Probability of Doom Is 90%
(12:44) Best Counterarguments For His High P(doom)
(14:41) Compute And Model Size Required For A Dangerous Model
(17:59) Details For Curtis' Model Of Compute Required
(21:23) Why This Estimate Of Compute Required Might Be Wrong, Ajeya Cotra's Transformative AI report
(29:00) Curtis' Median For AGI Is Around 2028, Used To Be 2027
(30:50) How Curtis Approaches Life With Short Timelines And High P(Doom)
(35:27) Takeoff Speeds: The Software View vs. The Hardware View
(39:57) Nvidia's 400k H100s rolling down the assembly line, AIs soon to be unleashed on their own source code
(41:04) Could We Get A Fast Takeoff By Fully Automating AI Research With More Compute
(46:00) The Entire World (Tech Companies, Governments, Militaries) Is Noticing New AI Capabilities That They Don't Have
(47:57) Open-Source vs. Closed-Source Policies, Mundane vs. Apocalyptic Considerations
(53:25) Curtis' background, from teaching himself deep learning to EleutherAI
(55:51) Alignment Project At EleutherAI: Markov Chains and Language Models
(01:02:15) Research Philosophy at EleutherAI: Pursuing Useful Projects, Multilingual, Discord, Logistics
(01:07:38) Alignment Minetest: Links To Alignment, Embedded Agency, Wireheading
(01:15:30) Next Steps For Alignment Minetest: Focusing On Model-Based RL
(01:17:07) Training On Human Data & Using An Updated Gym Environment With Human APIs
(01:19:20) Model Used, Not Observing Symmetry
(01:21:58) Another Goal Of Alignment Minetest: Study Corrigibility
(01:28:26) People Ordering H100s Are Aware Of Other People Making These Orders, Race Dynamics, Last Message
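As a companion to the compute discussion, here is a rough back-of-the-envelope sketch, not Curtis' actual model, using the standard approximation that training a dense transformer costs about 6 FLOPs per parameter per training token. The model size, token count, GPU count, and the assumed sustained per-GPU throughput are all illustrative numbers.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Standard rough estimate: ~6 FLOPs per parameter per training token."""
    return 6.0 * n_params * n_tokens

def training_days(total_flops: float, n_gpus: int,
                  effective_flops_per_gpu: float = 4e14) -> float:
    """Days of training, assuming ~4e14 FLOP/s sustained per GPU
    (an assumed H100-class figure after utilization losses)."""
    return total_flops / (n_gpus * effective_flops_per_gpu) / 86_400

if __name__ == "__main__":
    # Assumed example: a 70B-parameter model trained on 1.4T tokens.
    flops = training_flops(70e9, 1.4e12)
    print(f"total training compute: {flops:.2e} FLOPs")           # ~5.9e23
    print(f"days on 1024 GPUs: {training_days(flops, 1024):.1f}")  # ~16.6
```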
Jul 12, 2023 • 48min

Eric Michaud on scaling, grokking and quantum interpretability

Eric is a PhD student in the Department of Physics at MIT working with Max Tegmark on improving our scientific/theoretical understanding of deep learning -- understanding what deep neural networks do internally and why they work so well. This is part of a broader interest in the nature of intelligent systems, which previously led him to work with SETI astronomers, with Stuart Russell's AI alignment group (CHAI), and with Erik Hoel on a project related to integrated information theory.

Transcript: https://theinsideview.ai/eric
Youtube: https://youtu.be/BtHMIQs_5Nw
The Quantization Model of Neural Scaling: https://arxiv.org/abs/2303.13506
An Effective Theory of Representation Learning: https://arxiv.org/abs/2205.10343
Omnigrok: Grokking Beyond Algorithmic Data: https://arxiv.org/abs/2210.01117
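A toy illustration of the intuition behind "The Quantization Model of Neural Scaling" (the numbers and the learning rule here are assumptions, not the paper's): if discrete skills ("quanta") are needed with Zipf-distributed frequencies and a larger model learns more of them in frequency order, the leftover loss from unlearned quanta falls off as a power law in model size.

```python
import numpy as np

n_quanta = 10_000
alpha = 1.5  # assumed Zipf exponent for how often each quantum is needed

# freqs[k] ~ how often the k-th most useful "skill" is needed, normalized.
freqs = np.arange(1, n_quanta + 1, dtype=float) ** (-alpha)
freqs /= freqs.sum()

def residual_loss(n_learned: int) -> float:
    """Toy rule: a model that has learned the first n_learned quanta
    still pays loss on every quantum it has not learned."""
    return freqs[n_learned:].sum()

# Pretend capacity (quanta learned) grows linearly with parameter count.
for n_params, n_learned in [(1e6, 10), (1e7, 100), (1e8, 1000), (1e9, 10000)]:
    print(f"N={n_params:.0e}  quanta learned={n_learned:5d}  "
          f"loss={residual_loss(min(n_learned, n_quanta)):.4f}")
```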
