
Neel Nanda

Final year maths undergraduate at the University of Cambridge, gold medalist in the International Mathematical Olympiad, and an active member of the rationalist and effective altruism communities.

Top 5 podcasts with Neel Nanda

Ranked by the Snipd community
25 snips
Apr 19, 2020 • 57min

#9 – Neel Nanda on Effective Planning and Building Habits that Stick

Neel Nanda is a final year maths undergraduate at the University of Cambridge and a gold medalist in the International Mathematical Olympiad. He teaches regularly, from revision lectures to a recent 'public rationality' workshop, and is an active member of the rationalist and effective altruism communities.

In this episode we discuss:
- How to view self-improvement and optimising your goals
- Forming good habits through the 'TAPs' technique
- How to build effective plans by using our 'inner simulator' and 'pre-hindsight'

You can read more on this episode's accompanying write-up: hearthisidea.com/episodes/neel. You can also read Neel's teaching notes for his planning workshop here. If you have any feedback or suggestions for future guests, please get in touch through our website. Neel has also created an anonymous feedback form for this episode, and he would love to hear your thoughts! Please also consider leaving a review on Apple Podcasts or wherever you're listening to this; we're just starting out and it would really help listeners find us! If you want to support the show more directly, you can also buy us a beer at tips.pinecast.com/jar/hear-this-idea. Thanks for listening!
12 snips
Feb 16, 2023 • 1h 2min

Neel Nanda on Avoiding an AI Catastrophe with Mechanistic Interpretability

Neel Nanda joins the podcast to talk about mechanistic interpretability and how it can make AI safer. Neel is an independent AI safety researcher. You can find his blog here: https://www.neelnanda.io

Timestamps:
00:00 Introduction
00:46 How early is the field of mechanistic interpretability?
03:12 Why should we care about mechanistic interpretability?
06:38 What are some successes in mechanistic interpretability?
16:29 How promising is mechanistic interpretability?
31:13 Is machine learning analogous to evolution?
32:58 How does mechanistic interpretability make AI safer?
36:54 Does mechanistic interpretability help us control AI?
39:57 Will AI models resist interpretation?
43:43 Is mechanistic interpretability fast enough?
54:10 Does mechanistic interpretability give us a general understanding?
57:44 How can you help with mechanistic interpretability?

Social Media Links:
➡️ WEBSITE: https://futureoflife.org
➡️ TWITTER: https://twitter.com/FLIxrisk
➡️ INSTAGRAM: https://www.instagram.com/futureoflifeinstitute/
➡️ META: https://www.facebook.com/futureoflifeinstitute
➡️ LINKEDIN: https://www.linkedin.com/company/future-of-life-institute/
11 snips
Feb 23, 2023 • 35min

Neel Nanda on Math, Tech Progress, Aging, Living up to Our Values, and Generative AI

Neel Nanda joins the podcast for a lightning round on mathematics, technological progress, aging, living up to our values, and generative AI. You can find his blog here: https://www.neelnanda.io

Timestamps:
00:00 Introduction
00:55 How useful is advanced mathematics?
02:24 Will AI replace mathematicians?
03:28 What are the key drivers of tech progress?
04:13 What scientific discovery would disrupt Neel's worldview?
05:59 How should humanity view aging?
08:03 How can we live up to our values?
10:56 What can we learn from a person who lived 1,000 years ago?
12:05 What should we do after we have aligned AGI?
16:19 What important concept is often misunderstood?
17:22 What is the most impressive scientific discovery?
18:08 Are language models better learning tools than textbooks?
21:22 Should settling Mars be a priority for humanity?
22:44 How can we focus on our work?
24:04 Are human-AI relationships morally okay?
25:18 Are there aliens in the universe?
26:02 What are Neel's favourite books?
27:15 What is an overlooked positive aspect of humanity?
28:33 Should people spend more time prepping for disaster?
30:41 Neel's advice for teens.
31:55 How will generative AI evolve over the next five years?
32:56 How much can AIs achieve through a web browser?
4 snips
Sep 21, 2023 • 2h 5min

Neel Nanda on mechanistic interpretability, superposition and grokking

Neel Nanda, a researcher at Google DeepMind, discusses mechanistic interpretability in AI, induction heads in models, and his journey into alignment. He explores scalable oversight, how ambitious a degree of interpretability is achievable for transformer architectures, and the capacity of humans to understand complex models. The podcast also covers linear representations in neural networks, the concept of superposition in models and features, the MATS mentorship program, and the importance of interpretability in AI systems.
Jan 20, 2024 • 41min

[HUMAN VOICE] "How useful is mechanistic interpretability?" by ryan_greenblatt, Neel Nanda, Buck, habryka

Neel Nanda, an expert in mechanistic interpretability, discusses the challenges and potential applications of the field. The speakers explore concrete projects, debate how useful mechanistic interpretability really is, and discuss the limitations in achieving interpretability for transformative models like GPT-4. They also delve into model safety and ablations, and the potential of ruling out problematic behavior without fully understanding a model's internals. The speakers close by reflecting on how the dialogue advanced their thinking about mechanistic interpretability.