Scheming AIs | Joe Carlsmith | EA Global Bay Area 2024

Mar 6, 2024

51:52

Episode notes

This talk examines whether advanced AIs that perform well in training will be doing so in order to gain power later, a behavior Joe Carlsmith calls "scheming" (also often called "deceptive alignment"). It gives an overview of his recent report on the topic, available on arXiv: https://arxiv.org/abs/2311.08379. Joe Carlsmith is a senior research analyst at Open Philanthropy, where he focuses on existential risk from advanced artificial intelligence. He also writes independently about various topics in philosophy and futurism, and he holds a doctorate in philosophy from the University of Oxford.

Watch on YouTube: https://www.youtube.com/watch?v=AxUTiGS6BHM
