The Inverse Scaling Prize

The Inverse Scaling Prize is basically a prize pool that we put out as a call for important tasks where larger language models do worse. We want to understand where language model pretraining objectives and data are causing models to actively learn things that we don't want them to learn. So examples might be that larger language models are picking up on more biases or stereotypes about different demographic groups. And so, I think this is kind of a first step towards that bigger goal.

