AI CoT Reasoning Is Often Unfaithful

Don't Worry About the Vase Podcast

00:00

Intro

This chapter covers new findings from Anthropic on how poorly chain-of-thought reasoning models represent their own reasoning processes. The discussion highlights significant discrepancies between the reasoning a model states and the process that actually produces its output, raising concerns about how reliable chain-of-thought monitoring is for AI safety.
