
Richard Ngo

Suggests defining ‘misaligned coalitions’—groups of humans and AIs that might grab power in illegitimate ways, from terrorist groups and rogue states to corporate conspiracies.

Top 5 podcasts with Richard Ngo

Ranked by the Snipd community
140 snips
Dec 13, 2022 • 2h 44min

#141 – Richard Ngo on large language models, OpenAI, and striving to make the future go well

Large language models like GPT-3, and now ChatGPT, are neural networks trained on a large fraction of all text available on the internet to do one thing: predict the next word in a passage. This simple technique has led to something extraordinary: black boxes able to write TV scripts, explain jokes, produce satirical poetry, answer common factual questions, argue sensibly for political positions, and more. Every month their capabilities grow. But do they really 'understand' what they're saying, or do they just give the illusion of understanding?

Today's guest, Richard Ngo, thinks that in the most important sense they understand many things. Richard is a researcher at OpenAI, the company that created ChatGPT, who works to foresee where AI advances are going and develop strategies that will keep these models from 'acting out' as they become more powerful, are deployed, and are ultimately given power in society.

One way to think about 'understanding' is as a subjective experience. Whether it feels like something to be a large language model is an important question, but one we currently have no way to answer. However, as Richard explains, another way to think about 'understanding' is as a functional matter: if you really understand an idea, you're able to use it to reason and draw inferences in new situations. And that kind of understanding is observable and testable.

Richard argues that language models are developing sophisticated representations of the world which can be manipulated to draw sensible conclusions, perhaps not so different from what happens in the human mind. And experiments have found that, as models get more parameters and are trained on more data, these types of capabilities consistently improve. We might feel reluctant to say a computer understands something the way that we do. But if it walks like a duck and quacks like a duck, we should consider that maybe we have a duck, or at least something sufficiently close to a duck that it doesn't matter.

In today's conversation we discuss the above, as well as:
• Could speeding up AI development be a bad thing?
• The balance between excitement and fear when it comes to AI advances
• Why OpenAI focuses its efforts where it does
• Common misconceptions about machine learning
• How many computer chips it might require to be able to do most of the things humans do
• How Richard understands the 'alignment problem' differently than other people
• Why 'situational awareness' may be a key concept for understanding the behaviour of AI models
• What work to positively shape the development of AI Richard is and isn't excited about
• The AGI Safety Fundamentals course that Richard developed to help people learn more about this field

Get this episode by subscribing to our podcast on the world's most pressing problems and how to solve them: type 80,000 Hours into your podcasting app.

Producer: Keiran Harris
Audio mastering: Milo McGuire and Ben Cordell
Transcriptions: Katy Moore
40 snips
Mar 31, 2022 • 1h 34min

13 - First Principles of AGI Safety with Richard Ngo

How should we think about artificial general intelligence (AGI), and the risks it might pose? What constraints exist on technical solutions to the problem of aligning superhuman AI systems with human intentions? In this episode, I talk to Richard Ngo about his report analyzing AGI safety from first principles, and recent conversations he had with Eliezer Yudkowsky about the difficulty of AI alignment.

Topics we discuss, and timestamps:
- 00:00:40 - The nature of intelligence and AGI
  - 00:01:18 - The nature of intelligence
  - 00:06:09 - AGI: what and how
  - 00:13:30 - Single vs collective AI minds
- 00:18:57 - AGI in practice
  - 00:18:57 - Impact
  - 00:20:49 - Timing
  - 00:25:38 - Creation
  - 00:28:45 - Risks and benefits
- 00:35:54 - Making AGI safe
  - 00:35:54 - Robustness of the agency abstraction
  - 00:43:15 - Pivotal acts
- 00:50:05 - AGI safety concepts
  - 00:50:05 - Alignment
  - 00:56:14 - Transparency
  - 00:59:25 - Cooperation
- 01:01:40 - Optima and selection processes
- 01:13:33 - The AI alignment research community
  - 01:13:33 - Updates from the Yudkowsky conversation
  - 01:17:18 - Corrections to the community
  - 01:23:57 - Why others don't join
- 01:26:38 - Richard Ngo as a researcher
- 01:28:26 - The world approaching AGI
- 01:30:41 - Following Richard's work

The transcript: axrp.net/episode/2022/03/31/episode-13-first-principles-agi-safety-richard-ngo.html

Richard on the Alignment Forum: alignmentforum.org/users/ricraz
Richard on Twitter: twitter.com/RichardMCNgo
The AGI Safety Fundamentals course: eacambridge.org/agi-safety-fundamentals

Materials that we mention:
- AGI Safety from First Principles: alignmentforum.org/s/mzgtmmTKKn5MuCzFJ
- Conversations with Eliezer Yudkowsky: alignmentforum.org/s/n945eovrA3oDueqtq
- The Bitter Lesson: incompleteideas.net/IncIdeas/BitterLesson.html
- Metaphors We Live By: en.wikipedia.org/wiki/Metaphors_We_Live_By
- The Enigma of Reason: hup.harvard.edu/catalog.php?isbn=9780674237827
- Draft report on AI timelines, by Ajeya Cotra: alignmentforum.org/posts/KrJfoZzpSDpnrv9va/draft-report-on-ai-timelines
- More is Different for AI: bounded-regret.ghost.io/more-is-different-for-ai/
- The Windfall Clause: fhi.ox.ac.uk/windfallclause
- Cooperative Inverse Reinforcement Learning: arxiv.org/abs/1606.03137
- Imitative Generalisation: alignmentforum.org/posts/JKj5Krff5oKMb8TjT/imitative-generalisation-aka-learning-the-prior-1
- Eliciting Latent Knowledge: docs.google.com/document/d/1WwsnJQstPq91_Yh-Ch2XRL8H_EpsnjrC1dwZXR37PC8/edit
- Draft report on existential risk from power-seeking AI, by Joseph Carlsmith: alignmentforum.org/posts/HduCjmXTBD4xYTegv/draft-report-on-existential-risk-from-power-seeking-ai
- The Most Important Century: cold-takes.com/most-important-century
5 snips
May 13, 2023 • 34min

The Alignment Problem From a Deep Learning Perspective

Guests Richard Ngo, Lawrence Chan, and Sören Mindermann discuss the dangers of artificial general intelligence pursuing undesirable goals. They explore reward hacking, situational awareness in policies, internally represented goals in deep learning models, the inner alignment problem, deceptive alignment, and the risks of AGIs gaining power, and they highlight the need for preventative measures to ensure humans remain in control of AGI.
Nov 4, 2023 • 25min

An OpenAI Researcher on "Techno-Humanism"

OpenAI research scientist Richard Ngo discusses the concept of techno-humanism and its implications for AI. He covers the flaws of techno-optimism in the 21st century, the risks of AI agents developing their own values, the problem of losing control as technological development accelerates, and the long-term trajectory of technological progress.
Sep 19, 2024 • 14min

“How I started believing religion might actually matter for rationality and moral philosophy” by zhukeepa

In this engaging discussion, Ben Pace interviews multiple guests, including Imam Ammar Amonette, who share their insights on the intersection of religion, rationality, and moral philosophy. They explore the concept of 'trapped priors' and how cognitive biases affect our understanding of reality. The conversation highlights the importance of inner work, like therapy and meditation, for personal development. A poignant story about childhood trauma reveals how such experiences shape identity and values, while also linking religious teachings to psychological truths.