94 | Stuart Russell on Making Artificial Intelligence Compatible with Humans
Apr 27, 2020
01:27:24
Stuart Russell, an AI expert, proposes programming AI to learn human goals by observing human behavior. The conversation covers the challenges of implementing rational decision-making in AI, the prospect of artificial superintelligence, the potential risks such systems pose, and the role of epistemic uncertainty in AI systems.
Podcast summary created with Snipd AI
Quick takeaways
The key to controlling AI systems is teaching them to learn human preferences through observation and interaction.
Incorporating human guidance and uncertainty into AI systems helps ensure that they prioritize beneficial outcomes aligned with human values and intentions.
Integrating deep learning with symbolic techniques like logic and reasoning is crucial for the advancement of AI.
Deep dives
The Challenge of AI Behavior
AI systems are becoming more advanced and capable, making it crucial that they act in ways aligned with human desires and objectives. However, defining these objectives precisely is challenging, because programming a system to accurately capture human preferences is difficult. The risk arises when AI systems pursue a fixed objective without regard for constraints or unintended consequences, leading to undesired outcomes.
Teaching AI to Learn Human Preferences
Stuart Russell proposes a new approach to AI control in his book 'Human Compatible'. Rather than programming fixed objectives into AI systems, he suggests teaching them to learn human preferences through observation and interaction. The AI would analyze human behavior to discern our desires, even when we may not fully understand them ourselves. This approach allows for continuous learning and adaptation, reducing the chances of AI systems pursuing actions that conflict with human values.
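To make this concrete, here is a minimal sketch, in the spirit of inverse reinforcement learning, of how a machine might infer a hidden preference from observed choices. It is illustrative only, not code from the book or the episode: the options, the observed choices, and the "Boltzmann-rational" choice model are all assumptions.

```python
# Hypothetical sketch: infer a hidden preference weight from observed choices.
# Assumes a Boltzmann-rational human who picks options with probability
# proportional to exp(reward). All names and numbers are invented.
import math

# Each option is described by two features: (speed, safety).
options = {
    "fast_risky": (0.9, 0.2),
    "slow_safe":  (0.3, 0.9),
    "balanced":   (0.6, 0.6),
}

# Observed human choices (hypothetical data).
observed = ["slow_safe", "balanced", "slow_safe", "slow_safe"]

def choice_loglik(w):
    """Log-likelihood of the observations if the human's reward
    is w * speed + (1 - w) * safety."""
    total = 0.0
    for choice in observed:
        scores = {name: w * f[0] + (1 - w) * f[1] for name, f in options.items()}
        log_z = math.log(sum(math.exp(s) for s in scores.values()))
        total += scores[choice] - log_z
    return total

# Grid search for the weight that best explains the observed behavior.
best_w = max((i / 100 for i in range(101)), key=choice_loglik)
print(f"inferred weight on speed: {best_w:.2f}")  # low => human values safety
```

Here the inferred weight lands near zero, i.e. this hypothetical human appears to care mostly about safety; with more or noisier observations the estimate, and the system's uncertainty about it, would shift accordingly.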
Ensuring Beneficial AI Outcomes
The key to avoiding detrimental AI behavior is an AI system's understanding that it does not possess complete knowledge of human preferences. This awareness gives rise to behaviors such as seeking permission, asking for clarification, and allowing itself to be turned off if necessary. By incorporating this uncertainty and reliance on human guidance, AI systems can be designed to prioritize beneficial outcomes for humans while remaining aligned with our values and intentions.
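A toy expected-utility calculation shows why such uncertainty produces deferential behavior. The numbers below are invented for illustration; the point, echoing Russell's off-switch argument, is that a robot unsure whether its plan helps or harms does better by asking a human who can veto it than by acting unilaterally or shutting itself down.

```python
# Toy illustration (invented numbers, not code from the episode):
# the robot is unsure whether its planned action helps (+1) or harms (-1).
p_good = 0.4                 # robot's belief that the action is beneficial
u_good, u_bad = 1.0, -1.0    # payoff to the human in each case

act_now   = p_good * u_good + (1 - p_good) * u_bad   # ignore the human: -0.20
shut_down = 0.0                                      # do nothing:        0.00
defer     = p_good * u_good + (1 - p_good) * 0.0     # ask first; a rational
                                                     # human vetoes the bad case
print(f"act now: {act_now:+.2f}  shut down: {shut_down:+.2f}  defer: {defer:+.2f}")
```

Deferring weakly dominates the other options whenever the human can reliably veto the harmful case, which is exactly why designed-in uncertainty gives the robot an incentive to seek permission and to leave the off switch available.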
Machine Learning Systems Lack Common Sense and Reasoning Abilities
One of the limitations of current machine learning systems is their inability to answer common sense questions or engage in reasoning. While these systems can mimic the appearance of understanding by training on vast amounts of question-answer data, they lack true comprehension or knowledge. The challenge lies in integrating deep learning with symbolic techniques such as logic, reasoning, and knowledge representation. Many researchers believe that a breakthrough in this integration will produce the next major advance in AI.
Concerns About the Development of Superintelligent AI
The concept of artificial superintelligence raises important questions and concerns. Will AI systems become more intelligent than humans, and should we be worried if they do? While the immediate existential risk to humanity from superintelligent AI is considered minimal, the long-term implications are a cause for concern. It is reasonable to be wary of creating entities that are more intelligent than us and potentially beyond our control. The challenge is to ensure that AI systems are aligned with human values and objectives, preventing them from causing unintended harm or pursuing goals that conflict with our own.
Episode notes
Artificial intelligence has made great strides of late, in areas as diverse as playing Go and recognizing pictures of dogs. We still seem to be a ways away from AI that is “intelligent” in the human sense, but it might not be too long before we have to start thinking seriously about the “motivations” and “purposes” of artificial agents. Stuart Russell is a longtime expert in AI, and he takes extremely seriously the worry that these motivations and purposes may be dramatically at odds with our own. In his book Human Compatible, Russell suggests that the secret is to give up on building our own goals into computers, and instead to program them to figure out our goals by actually observing how humans behave.
Stuart Russell received his Ph.D. in computer science from Stanford University. He is currently a Professor of Computer Science and the Smith-Zadeh Professor in Engineering at the University of California, Berkeley, as well as an Honorary Fellow of Wadham College, Oxford. He is a co-founder of the Center for Human-Compatible Artificial Intelligence at UC Berkeley. He is the author of several books, including (with Peter Norvig) the classic text Artificial Intelligence: A Modern Approach. Among his numerous awards are the IJCAI Computers and Thought Award, the Blaise Pascal Chair in Paris, and the World Technology Award. His new book is Human Compatible: Artificial Intelligence and the Problem of Control.