The Inside View cover image

Shahar Avin–Intelligence Rising, AI Governance

The Inside View

00:00

How to Red Team AI Systems

Red teaming is an adversarial relationship where you pretend to be the adversary and break whatever system or process you have in place. I think there's more and more adversarial examples when building self-driving cars. With the rising of language models, people can just test models like HP3 and find ways in which to make these behaviors. You might worry that when you get to very advanced systems, this method no longer works and you need different kinds of solutions.

Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app