
"(My understanding of) What Everyone in Technical Alignment is Doing and Why" by Thomas Larsen & Eli Lifland
LessWrong (Curated & Popular)
How to Oversee Logical Reasoning
An AGI could be more capable by doing some reasoning not in English. This scheme requires not too much steganography: the model can't hide its reasoning in non-English logical thought. A key question for this research agenda is thus how to put pressure to keep the reasoning externalized, instead of collapsing into an internal slush of logical reasoning that is harder for us to oversee. So here's a diagram showing three circles arranged in a roughly triangular shape, with "LLM output" and "external reasoning" written inside the circles. And then we have arrows that point in different directions, showing the pressures that are exerted between these different things. That was the SSL arrow pointing out of the


