
AI CoT Reasoning Is Often Unfaithful
Don't Worry About the Vase Podcast
Exploring the Limits of Faithfulness in AI Reasoning
This chapter examines how reinforcement learning (RL) relates to the faithfulness of a model's reasoning. Using graphs that compare two models, it shows the limits of RL as a way to improve reasoning faithfulness.