AI Safety Fundamentals: Alignment

Introduction to Mechanistic Interpretability

Jan 2, 2025

11:45

Episode notes

This introduction covers common mechanistic interpretability concepts to prepare you for the rest of this session's resources.

Original text: https://aisafetyfundamentals.com/blog/introduction-to-mechanistic-interpretability/

Author(s): Sarah Hastings-Woodhouse

A podcast by BlueDot Impact.

Learn more on the AI Safety Fundamentals website.
