LessWrong (Curated & Popular)

[HUMAN VOICE] "A Shutdown Problem Proposal" by johnswentworth, David Lorell

Feb 9, 2024
In this podcast, johnswentworth and David Lorell propose a solution to the shutdown problem in AI, using a sub-agent architecture and negotiation between utility-maximizing subagents. They discuss the design of an agent with multiple subagents and the importance of corrigibility. They also explore alignment problems, ontological issues, the design of utility functions, and the challenges of bridging the theory-practice gap.