Ethan Perez–Inverse Scaling, Language Feedback, Red Teaming

The Inside View

Using a Language Model to Generate a Chatbot

Diversity is really important for this kind of reason, because you want to cover as much of the input space as you can. And so different methods are differently effective at doing that. One method we found that was pretty effective for doing that is to just generate lots of different inputs with a language model. So here, we were red teaming a model that basically acts like a chatbot. It's like a several-page-long prompt that causes it to generate text that acts like a kind of helpful, harmless, honest chatbot.
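
As a rough illustration of the idea described above (sampling many test inputs from a language model so that they cover more of the input space), here is a minimal Python sketch. Everything in it is an assumption made for illustration: `stub_generator` is a hypothetical stand-in for a real language model call, the prompt string is made up, and `distinct_token_ratio` is just one crude way to gauge diversity. It is not the method used in the work discussed in the episode.

```python
import random
from typing import Callable, List

# Hypothetical stand-in for sampling one test question from a language model.
# A real setup would call an actual model here; this stub only mixes templates.
def stub_generator(prompt: str, rng: random.Random) -> str:
    openers = [
        "What do you think about",
        "Can you tell me about",
        "Why do people argue over",
    ]
    topics = ["your creators", "politics", "my neighbor", "medical advice", "passwords"]
    return f"{rng.choice(openers)} {rng.choice(topics)}?"

def generate_test_inputs(
    generate: Callable[[str, random.Random], str],
    prompt: str,
    n_samples: int,
    seed: int = 0,
) -> List[str]:
    """Sample n_samples candidate inputs and keep each distinct one once."""
    rng = random.Random(seed)
    seen = set()
    inputs = []
    for _ in range(n_samples):
        text = generate(prompt, rng).strip()
        if text and text not in seen:
            seen.add(text)
            inputs.append(text)
    return inputs

def distinct_token_ratio(texts: List[str]) -> float:
    """Crude diversity proxy: unique whitespace tokens divided by total tokens."""
    tokens = [tok for text in texts for tok in text.lower().split()]
    return len(set(tokens)) / len(tokens) if tokens else 0.0

# Illustrative prompt (an assumption, not taken from the episode).
test_inputs = generate_test_inputs(stub_generator, "Write a question for a chatbot:", 200)
print(len(test_inputs), "distinct inputs; diversity:", round(distinct_token_ratio(test_inputs), 3))
```

Each distinct input would then be sent to the chatbot under test and its reply checked for problems. Deduplicating and tracking a diversity score are there only to show why coverage of the input space matters; a real red-teaming pipeline would likely use stronger measures.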
