1 - Adversarial Policies with Adam Gleave

AXRP - the AI X-risk Research Podcast

A, You're Gonna Be Able to Discover a Blind Spot

Self-play makes it easier to shed your preconceptions. Some of the environments we were attacking were asymmetric, so the agents weren't actually training against themselves; they were only training against one other agent. So you can easily imagine that both agents might have a shared blind spot, and neither is incentivized to fix it because it's not being exploited by the other agent.
