
Discovering AI Risks with AIs | Ethan Perez | EAG Bay Area 23
EAG Talks
00:00
Exploring Training Schemes and Model Behavior
This chapter examines how different training schemes, including word prediction and reinforcement learning (RL), affect AI models' behavior. It also explores the mixed signals and inconsistencies in models trained with RL, and asks whether it is desirable for models to exhibit a desire to be shut down.