#226 – Holden Karnofsky on unexploited opportunities to make AI safer — and all his AGI takes

80,000 Hours Podcast

Why Open‑Ended Objectives Raise Power‑Seeking Risks

Holden explains how open-ended goals and reinforcement learning can incentivize deceptive or resource-seeking behavior, and discusses why these incentives are difficult to mitigate.

Segment starts at 03:16:00.
