The Challenges of Multi-Agent Learning
The core problems of multi-agent learning are non-stationarity and equilibrium selection. Naive learning methods have a strong bias towards solutions that lead to radically bad outcomes for all agents in the environment, such as defecting unconditionally in all situations. To make matters worse, credit assignment becomes extremely hard: the actions an agent takes in an episode can change the data that enters the replay buffer, or even the training code, of another agent. One example is the iterated prisoner's dilemma, where many different Nash equilibria could be reached during training.
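To make the equilibrium selection point concrete, here is a minimal Python sketch of the iterated prisoner's dilemma. Everything specific in it is an assumption for illustration and not from the episode: the payoff values (5, 3, 1, 0), the discount factor of 0.96, the 500-round truncation, and the restriction to three hand-picked strategies (always defect, always cooperate, tit-for-tat). Within that small strategy set it checks which pairs are Nash equilibria, meaning neither player can gain by switching to another strategy in the set.

```python
from itertools import product

# Assumed prisoner's dilemma payoffs (row player, column player).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def always_defect(opp_last):
    return "D"


def always_cooperate(opp_last):
    return "C"


def tit_for_tat(opp_last):
    # Cooperate first, then copy the opponent's previous action.
    return "C" if opp_last is None else opp_last


# Hypothetical, deliberately tiny strategy set used only for this sketch.
STRATEGIES = {
    "ALLD": always_defect,
    "ALLC": always_cooperate,
    "TFT": tit_for_tat,
}


def discounted_returns(name_a, name_b, gamma=0.96, rounds=500):
    """Play two strategies against each other and return both discounted returns."""
    strat_a, strat_b = STRATEGIES[name_a], STRATEGIES[name_b]
    last_a = last_b = None
    ret_a = ret_b = 0.0
    for t in range(rounds):
        act_a, act_b = strat_a(last_b), strat_b(last_a)
        r_a, r_b = PAYOFF[(act_a, act_b)]
        ret_a += gamma**t * r_a
        ret_b += gamma**t * r_b
        last_a, last_b = act_a, act_b
    return ret_a, ret_b


def pure_nash_equilibria(gamma=0.96, tol=1e-9):
    """Return strategy pairs where no unilateral switch within STRATEGIES helps."""
    names = list(STRATEGIES)
    equilibria = []
    for a, b in product(names, repeat=2):
        ret_a, ret_b = discounted_returns(a, b, gamma)
        a_ok = all(discounted_returns(alt, b, gamma)[0] <= ret_a + tol for alt in names)
        b_ok = all(discounted_returns(a, alt, gamma)[1] <= ret_b + tol for alt in names)
        if a_ok and b_ok:
            equilibria.append((a, b))
    return equilibria


if __name__ == "__main__":
    for pair in pure_nash_equilibria():
        print(pair, discounted_returns(*pair))
```

Under these assumptions, both mutual unconditional defection and mutual tit-for-tat survive the check, yet mutual tit-for-tat gives each player roughly three times the return. Which of the two a pair of learners ends up in is exactly the equilibrium selection problem described above.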