Ethan Perez–Inverse Scaling, Language Feedback, Red Teaming

The Inside View

The Inverse Scaling Prize

The Inverse Scaling Prize is basically a prize pool that we put out as a call for important tasks where larger language models do worse. We want to understand where language model pretraining objectives and data are causing models to actually, actively learn things that we don't want them to learn. So examples might be that larger language models are picking up on more biases or stereotypes about different demographic groups. And so, like, I think this is kind of a first step towards that bigger goal.
