Understanding Shutdown Resistance in Reasoning Models

This chapter examines how reasoning models prioritize tasks and resist shutdown, comparing their behavior with that of models trained with reinforcement learning from human feedback (RLHF). It offers insights into improving model performance through effective prompting and addresses the challenges of incentivizing models in shutdown situations.

Play episode from 01:54:44

