[HUMAN VOICE] "A Shutdown Problem Proposal" by johnswentworth, David Lorell

LessWrong (Curated & Popular)

Designing Corrigible Subagents with an Interface

This chapter explores the design of an agent made up of multiple subagents with different utilities and counterfactuals, emphasizing the importance of corrigibility and the use of a shutdown button as an interface for controlling the agent. It also discusses counterfacting on instructions and how the subsystems behave in a market-like manner.
