Intro

This chapter delves into research findings on the effects of penalizing language model outputs and the resulting obfuscation of reasoning processes. It also addresses feedback spillover and proposes using distinct models for reasoning and output generation to overcome these issues.

Transcript

Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app