The Different Problems of Multi-Agent Learning
In multi-agent learning, one of the grand challenges is: what policy do you play? It all depends on what others are doing. If you have a fully cooperative setting where you're able to control the policies, that is, the way every single player on the team acts, then that effectively reduces to a single-agent problem. In two-player zero-sum games, I can specify the requirement of having a Nash equilibrium. And if I find any Nash equilibrium, then I'm guaranteed not to be beaten by any partner at test time. That's why competition and fully cooperative self-play are special cases of multi-agent learning.
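The zero-sum guarantee can be illustrated with a small sketch. It is not from the episode: the choice of rock-paper-scissors, the payoff matrix, and the helper name `worst_case_value` are all assumptions made for illustration. It compares the worst-case expected payoff of the uniform Nash strategy with that of a fixed "always rock" strategy.

```python
import numpy as np

# Row player's payoffs in rock-paper-scissors (illustrative game, not from the talk).
# Rows and columns are ordered rock, paper, scissors; the column player receives the negative.
PAYOFF = np.array([
    [0.0, -1.0, 1.0],
    [1.0, 0.0, -1.0],
    [-1.0, 1.0, 0.0],
])

def worst_case_value(row_strategy, payoff=PAYOFF):
    """Expected row-player payoff when the column player best-responds."""
    # Expected payoff against each pure column action; a best-responding
    # opponent picks the action that minimises the row player's payoff.
    return float(np.min(row_strategy @ payoff))

nash = np.array([1 / 3, 1 / 3, 1 / 3])   # uniform mixing is the Nash equilibrium here
always_rock = np.array([1.0, 0.0, 0.0])  # a non-equilibrium policy, for contrast

nash_value = worst_case_value(nash)
rock_value = worst_case_value(always_rock)

print("worst case for Nash strategy:", nash_value)
print("worst case for always-rock:", rock_value)
```

In this toy game the Nash strategy cannot be pushed below the game's value of zero by any opponent, while the fixed strategy can be exploited by an opponent who always plays paper.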