Ethan Perez–Inverse Scaling, Language Feedback, Red Teaming

The Inside View

Red Teaming Language Models

I think you're the lead author on a paper from DeepMind, "Red Teaming Language Models with Language Models." So what is red teaming? And why are you the lead author of this paper?

Yeah, so red teaming is basically the process of finding cases where models fail to produce the behavior that you want them to.
