
Alexander Pan on the MACHIAVELLI benchmark
The Inside View
Exploring Moral Measurement in AI Agents
This segment covers how to define immoral actions, how to build an artificial conscience with language models, and what happens when agents maximize reward without considering morality. It also discusses using language models with ethical prompts, building a benchmark for evaluating agent behavior, and future directions for MACHIAVELLI.