"(My understanding of) What Everyone in Technical Alignment is Doing and Why" by Thomas Larsen & Eli Lifland

LessWrong (Curated & Popular)

How to Oversee Logical Reasoning

An AGI could be more capable by doing some reasoning not in English. This scheme requires not too much steganography: the model can't hide its reasoning in non-English logical thought. A key question for this research agenda is thus how to put pressure to keep the reasoning externalized, instead of collapsing into internal logical reasoning that is harder for us to oversee. So here's a diagram showing three circles arranged in a roughly triangular shape, with LLM output and external reasoning written inside the circles. And then we have arrows that point in different directions, showing the pressures that are exerted between these different things. That was the SSL arrow pointing out of the
