Collin Burns On Discovering Latent Knowledge In Language Models Without Supervision

The Inside View

How to Extract a Hidden State in a 3D Model?

In the hidden state space, how do you extract it from the vector of hidden states? We literally just construct this input, which includes the answer in it, and then we look at the hidden state that comes after the answer, basically. The goal is to find a direction that classifies inputs as true or false; a direction in 3D would be like a linear model.
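
To make the "direction as a linear model" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the hidden states are synthetic 3D vectors rather than activations read from a real language model, `TRUE_DIR` and `fake_hidden_state` are hypothetical names, and the direction is found by taking the top principal component of the differences between the "Yes" and "No" versions of each input. That is one simple unsupervised choice, not necessarily the method discussed in the episode.

```python
import random

random.seed(0)

# Hypothetical unit-length "truth" direction hidden in a 3D state space.
TRUE_DIR = [0.6, 0.0, 0.8]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fake_hidden_state(sign):
    """Stand-in for the hidden state read after the answer token.

    sign=+1 means the input (statement plus answer) is true, -1 false.
    A real pipeline would run the model on the constructed input instead.
    """
    return [sign * t + random.gauss(0.0, 0.3) for t in TRUE_DIR]

# Balanced labels: +1 if the statement is actually true, -1 otherwise.
# They are used only for evaluation, never for fitting the direction.
labels = [1 if i % 2 == 0 else -1 for i in range(200)]

# For each statement, build both inputs (answer "Yes" and answer "No")
# and keep the difference of the two hidden states.
diffs = []
for y in labels:
    h_yes = fake_hidden_state(+y)   # statement + "Yes"
    h_no = fake_hidden_state(-y)    # statement + "No"
    diffs.append([a - b for a, b in zip(h_yes, h_no)])

# Center the differences.
mean = [sum(d[i] for d in diffs) / len(diffs) for i in range(3)]
centered = [[d[i] - mean[i] for i in range(3)] for d in diffs]

# Power iteration for the top principal component of the differences.
w = [1.0, 1.0, 1.0]
for _ in range(50):
    new = [0.0, 0.0, 0.0]
    for x in centered:
        p = dot(x, w)
        for i in range(3):
            new[i] += p * x[i]
    norm = dot(new, new) ** 0.5
    w = [c / norm for c in new]

# The direction acts as a linear classifier: the sign of the projection.
preds = [1 if dot(x, w) > 0 else -1 for x in centered]
raw = sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Without labels the sign of the direction is arbitrary, so report
# accuracy up to a global flip.
accuracy = max(raw, 1.0 - raw)
print("direction:", [round(c, 3) for c in w])
print("accuracy up to sign:", accuracy)
```

Because no labels are used when fitting, the recovered direction is only determined up to sign, which is why the accuracy is reported up to a global flip.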

