
Alexander Pan on the MACHIAVELLI benchmark
The Inside View
00:00
Introduction
Exploring the MACHIAVELLI benchmark paper, which evaluates power-seeking and deception in language model agents, with a focus on realistic testing environments and tracking instances of deceptive behavior.