Collin Burns On Discovering Latent Knowledge In Language Models Without Supervision

The Inside View

00:00

Introduction

Collin Burns is a second-year ML PhD student at Berkeley working with Jacob Steinhardt and Dan Klein. His focus is on making language models honest, interpretable, and aligned. He once broke the official world record for solving a Rubik's Cube, in about five seconds. And we're going to be talking a lot about this paper today.
