
"Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research" by evhub, Nicholas Schiefer, Carson Denison, Ethan Perez
LessWrong (Curated & Popular)
00:00
The Case Against Model Organisms in Research
This chapter explores the risks associated with using model organisms in research, including the potential for deceptive alignment. It discusses the need for scaling laws, fast discovery, and studying deceptive alignment, while also addressing the lack of empirical evidence and proposing testing methods for models with varying levels of effective compute.
Transcript
Play full episode