
AI CoT Reasoning Is Often Unfaithful

Don't Worry About the Vase Podcast


Evaluating AI Reasoning Models

This chapter explores how the faithfulness of AI reasoning models is assessed, particularly through hint-based evaluations. It reveals that models answer inconsistently when presented with hints, and discusses phenomena such as reward hacking and the implications for transparency in AI reasoning.

