David Silver, an original DeepMind researcher renowned for his work on AlphaGo, shares his bold vision for AI's future. He discusses how AI should evolve beyond reliance on human data, advocating for self-learning systems capable of generating their own experiences. Silver reflects on significant moments like Move 37 in Go, illustrating AI's potential for creativity. He also introduces AlphaProof, a groundbreaking system for AI-driven mathematical proofs, emphasizing the importance of trust and innovative approaches in the AI landscape.
49:38
INSIGHT
Era of Experience
David Silver introduces the "era of experience" in AI, contrasting it with the current "era of human data".
He argues that AI needs to move beyond human knowledge to discover new things.
ANECDOTE
AlphaGo's Breakthrough
AlphaZero, the successor to AlphaGo that used no human data, became the strongest Go program.
These systems learned through trial and error, discovering novel moves such as AlphaGo's "Move 37".
INSIGHT
LLMs' Lack of Breakthroughs
David Silver suggests that large language models (LLMs) haven't had a "Move 37" moment yet.
He believes this is due to their reliance on human data, limiting their creativity.
In this episode of Google DeepMind: The Podcast, VP of Reinforcement Learning, David Silver, describes his vision for the future of AI, exploring the concept of the "era of experience" versus the current "era of human data". Using AlphaGo and AlphaZero as examples, he highlights how these systems surpassed human capabilities by engaging in reinforcement learning without prior human knowledge. This approach contrasts with large language models, which depend on human data and feedback. Silver emphasizes the need to explore this path to drive AI progress and achieve artificial superintelligence.
Timestamps
00:00 Introduction
01:50 Era of experience
03:45 AlphaZero
10:19 Move 37
15:20 Reinforcement learning and human feedback
24:30 AlphaProof
29:50 Math Olympiads
35:00 Experience based methods
42:56 Hannah's reflections
44:00 Fan Hui joins
___
Thanks to everyone who made this possible, including but not limited to:
Presenter: Professor Hannah Fry
Series Producer: Dan Hardoon
Series Editor: Rami Tzabar
Commissioner & Producer: Emma Yousif
Music Composition: Eleni Shaw
Audio Engineer: Richard Courtice
Production Manager: Dan Lazard
Video Director and Editor: Bernardo Resende
Video Studio Production: Nicholas Duke
Video Editor: Bilal Merhi
Audio Engineer: Perry Rogantin
Camera and Lighting Operator: Robert Messere
Production Coordination: Zoey Roberts, Sarah Ellen Morton
Visual Identity and Design: Rob Ashley
Commissioned by Google DeepMind
Please leave us a review on Spotify or Apple Podcasts if you enjoyed this episode. We always want to hear from our audience, whether that's in the form of feedback, a new idea, or a guest recommendation!