
Jan Leike

Head of Alignment at OpenAI, co-leading the Superalignment project focused on making superintelligent AI systems safe.

Top 5 podcasts with Jan Leike

Ranked by the Snipd community
75 snips
Aug 7, 2023 • 2h 51min

#159 – Jan Leike on OpenAI's massive push to make superintelligence safe in 4 years or less

Jan Leike, Head of Alignment at OpenAI and leader of the Superalignment project, discusses the ambitious goal of safely developing superintelligent AI within four years. He addresses the challenges of aligning AI with human values and the importance of Reinforcement Learning from Human Feedback (RLHF). Leike expresses guarded optimism about finding solutions to steer AI safely, emphasizing collaboration and innovative approaches in tackling these complex issues. The conversation also highlights recruitment efforts to build a team for this critical initiative.
23 snips
Jul 27, 2023 • 2h 8min

24 - Superalignment with Jan Leike

Recently, OpenAI made a splash by announcing a new "Superalignment" team. Led by Jan Leike and Ilya Sutskever, the team would consist of top researchers attempting to solve alignment for superintelligent AIs in four years by figuring out how to build a trustworthy human-level AI alignment researcher, and then using it to solve the rest of the problem. But what does this plan actually involve? In this episode, I talk to Jan Leike about the plan and the challenges it faces.

Patreon: patreon.com/axrpodcast
Ko-fi: ko-fi.com/axrpodcast
Episode art by Hamish Doodles: hamishdoodles.com/

Topics we discuss, and timestamps:
- 0:00:37 - The superalignment team
- 0:02:10 - What's a human-level automated alignment researcher?
  - 0:06:59 - The gap between human-level automated alignment researchers and superintelligence
  - 0:18:39 - What does it do?
  - 0:24:13 - Recursive self-improvement
- 0:26:14 - How to make the AI AI alignment researcher
  - 0:30:09 - Scalable oversight
  - 0:44:38 - Searching for bad behaviors and internals
  - 0:54:14 - Deliberately training misaligned models
- 1:02:34 - Four year deadline
  - 1:07:06 - What if it takes longer?
- 1:11:38 - The superalignment team and...
  - 1:11:38 - ... governance
  - 1:14:37 - ... other OpenAI teams
  - 1:18:17 - ... other labs
- 1:26:10 - Superalignment team logistics
- 1:29:17 - Generalization
- 1:43:44 - Complementary research
- 1:48:29 - Why is Jan optimistic?
  - 1:58:32 - Long-term agency in LLMs?
  - 2:02:44 - Do LLMs understand alignment?
- 2:06:01 - Following Jan's research

The transcript: axrp.net/episode/2023/07/27/episode-24-superalignment-jan-leike.html

Links for Jan and OpenAI:
- OpenAI jobs: openai.com/careers
- Jan's substack: aligned.substack.com
- Jan's twitter: twitter.com/janleike

Links to research and other writings we discuss:
- Introducing Superalignment: openai.com/blog/introducing-superalignment
- Let's Verify Step by Step (process-based feedback on math): arxiv.org/abs/2305.20050
- Planning for AGI and beyond: openai.com/blog/planning-for-agi-and-beyond
- Self-critiquing models for assisting human evaluators: arxiv.org/abs/2206.05802
- An Interpretability Illusion for BERT: arxiv.org/abs/2104.07143
- Language models can explain neurons in language models: openaipublic.blob.core.windows.net/neuron-explainer/paper/index.html
- Our approach to alignment research: openai.com/blog/our-approach-to-alignment-research
- Training language models to follow instructions with human feedback (aka the InstructGPT paper): arxiv.org/abs/2203.02155
22 snips
Sep 29, 2021 • 1h 5min

96. Jan Leike - AI alignment at OpenAI

The more powerful our AIs become, the more we'll have to ensure that they're doing exactly what we want. If we don't, we risk building AIs that pursue dangerously creative solutions with undesirable, or downright dangerous, side effects. Even a slight misalignment between the motives of a sufficiently advanced AI and human values could be hazardous. That's why leading AI labs like OpenAI are already investing significant resources into AI alignment research. Understanding that research is important if you want to understand where advanced AI systems might be headed, and what challenges we might encounter as AI capabilities continue to grow. That's what this episode of the podcast is all about.

My guest today is Jan Leike, head of AI alignment at OpenAI, and an alumnus of DeepMind and the Future of Humanity Institute. As someone who works directly with some of the world's largest AI systems (including OpenAI's GPT-3), Jan has a unique and interesting perspective to offer, both on the current challenges facing alignment researchers and on the most promising future directions the field might take.

---

Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc

---

Chapters:
0:00 Intro
1:35 Jan's background
7:10 Timing of scalable solutions
16:30 Recursive reward modeling
24:30 Amplification of misalignment
31:00 Community focus
32:55 Wireheading
41:30 Arguments against the democratization of AIs
49:30 Differences between capabilities and alignment
51:15 Research to focus on
1:01:45 Formalizing an understanding of personal experience
1:04:04 OpenAI hiring
1:05:02 Wrap-up
19 snips
Aug 20, 2019 • 33min

AI, Robot

Forget what sci-fi has told you about superintelligent robots that are uncannily human-like; the reality is more prosaic. Inside DeepMind's robotics laboratory, Hannah explores what researchers call 'embodied AI': robot arms that are learning tasks like picking up plastic bricks, which humans find comparatively easy. Discover the cutting-edge challenges of bringing AI and robotics together, and of learning from scratch how to perform tasks. She also explores some of the key questions about using AI safely in the real world.

If you have a question or feedback on the series, message us on Twitter (@DeepMind using the hashtag #DMpodcast) or email us at podcast@deepmind.com.

Further reading:
- Blogs on AI safety and further resources from Victoria Krakovna
- The Future of Life Institute: The risks and benefits of AI
- The Wall Street Journal: Protecting Against AI's Existential Threat
- TED Talks: Max Tegmark - How to get empowered, not overpowered, by AI
- Royal Society lecture series sponsored by DeepMind: You & AI
- Nick Bostrom: Superintelligence: Paths, Dangers and Strategies (book)
- OpenAI: Learning from Human Preferences
- DeepMind blog: Learning from human preferences
- DeepMind blog: Learning by playing - how robots can tidy up after themselves
- DeepMind blog: AI safety

Interviewees: Software engineer Jackie Kay and research scientists Murray Shanahan, Victoria Krakovna, Raia Hadsell and Jan Leike.

Credits:
Presenter: Hannah Fry
Editor: David Prest
Senior Producer: Louisa Field
Producers: Amy Racs, Dan Hardoon
Binaural Sound: Lucinda Mason-Brown
Music composition: Eleni Shaw (with help from Sander Dieleman and WaveNet)
Commissioned by DeepMind

Please like and subscribe on your preferred podcast platform. Want to share feedback? Or have a suggestion for a guest that we should have on next? Leave us a comment on YouTube and stay tuned for future episodes.
6 snips
Mar 18, 2025 • 13min

OpenAI just had a BLACK SWAN "iPhone Moment" with GPT-4o - Here's what that means for Google...

Ilya Sutskever, co-founder of OpenAI and a leading AI researcher, joins Jan Leike, an expert in AI safety, to discuss the groundbreaking implications of GPT-4o, likening its release to an 'iPhone moment' for the industry. They dive into how this technology could reshape traditional business models, especially in search. The duo also teases their upcoming book, 'Heavy Silver', and reflects on the ethical responsibilities of AI development. Their insights highlight both the excitement and the challenges of navigating this new AI frontier.