
Tony Wang on Beating Superhuman Go AIs with Adversarial Policies
The Inside View
Exploit AlphaGo in 2016
Since 2016, you were like, "I wanted to exploit AlphaGo." So we attacked KataGo, which is a more modern version of the AlphaZero-style system. How we found the exploit against it is that we actually trained another Go AI to beat a victim KataGo system. And the training algorithm we used was actually derived from AlphaZero, which was the thing used to train KataGo itself.