
The Alignment Problem From a Deep Learning Perspective
AI Safety Fundamentals: Alignment
Internally Represented Goals in Deep Learning Models
This chapter explores the concept of internally represented goals in deep learning models, covering both model-based and model-free policies. It argues that as AI architectures become more expressive and policies generalize beyond their training distributions, internally represented goals will become more extensive, including broadly scoped goals that apply over long time frames and wide ranges of tasks.