David Bau on Editing Facts in GPT, AI Safety and Interpretability

The Inside View

Introduction

Machine learning is about learning black-box models that achieve some sort of goal. In the past year, there was this ROME paper, which is about making models more interpretable and maybe editing some knowledge inside of them. The basic idea is to understand the mechanisms that go on inside the black box. Understanding those internal features and those internal signals is what interpretability is all about.
