Localizing and Editing Knowledge in LLMs with Peter Hase - #679

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Interpreting AI Model Decisions

The reasoning a model gives for its decisions may not reflect truth or an internal world model. It may instead be a product of next-token prediction that happens to align with information found in a layer, without any fundamental connection to it. How a model's explanations are interpreted depends on the frame: under one frame the model is generating human-like text, and under another it is representing what is true or false.
