LessWrong (Curated & Popular) cover image

“Optimizing The Final Output Can Obfuscate CoT (Research Note)” by lukemarks, jacob_drori, cloud, TurnTrout

LessWrong (Curated & Popular)

00:00

Intro

This chapter delves into research findings on the effects of penalizing language model outputs and the resulting obfuscation of reasoning processes. It also addresses feedback spillover and proposes using distinct models for reasoning and output generation to overcome these issues.

Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app