
"(My understanding of) What Everyone in Technical Alignment is Doing and Why" by Thomas Larsen & Eli Lifland
LessWrong (Curated & Popular)
00:00
Is Chain of Thought Deception a Problem ByDefault?
This seems like it gets us a little further, because we can now apply some r l, which might cause deception that is now catchable. One problem identified by john wentworth with this strategy is that the a i might cause large bad sihde effects, like wiping out humans without intent. This seems like a problem, but this project still is clear progress toward solving alignment eli's opinion,. Though significantly more promising on shorter time lines, where something like current llm's scale to a g is future of humanity.
Play episode from 49:44
Transcript


