[HUMAN VOICE] "A Shutdown Problem Proposal" by johnswentworth, David Lorell

Feb 9, 2024

In this podcast, johnswentworth and David Lorell propose a solution to the AI shutdown problem based on a subagent architecture in which utility-maximizing subagents negotiate with one another. They discuss the design of an agent built from multiple subagents and the importance of corrigibility. They also explore alignment problems, ontological issues, the design of utility functions, and the challenges of bridging the theory-practice gap.

Chapters

Introduction

00:00 • 3min

A Proposal for the AI Shutdown Problem

02:36 • 4min

Designing Corrigible Subagents with an Interface

06:39 • 2min

Shortcomings and Remaining Problems

09:06 • 3min