
LessWrong (Curated & Popular) "Discussion with Nate Soares on a key alignment difficulty" by Holden Karnofsky
Apr 5, 2023
Chapters
Introduction
00:00 • 2min
How an AI Avoids POUDA (Pretty Obviously Unintended/Dangerous Actions)
02:21 • 4min
Holden's Hypothetical AI Training Approach
06:49 • 2min
The Dangers of Needle-Moving Alignment Research
09:13 • 2min
The Importance of Reflection in AI Research
11:18 • 4min
The Dangers of Mechanistic AI Training
15:24 • 3min
How AIs Can Help You Achieve Your Goals
18:51 • 4min
The Importance of Random Goals in AI Training
22:38 • 2min
The Implications of Holden's Training Setup
24:48 • 2min
The Importance of Robust POUDA Avoidance
27:08 • 3min
The Game of Confidence
29:54 • 2min
The Doom of AI Training
31:25 • 2min
How Creative Intellectual Work Works
33:27 • 2min
The Future of Alignment Research
35:27 • 2min
AI-Driven Research on the Alignment Problem
37:51 • 2min
