
Episode 19: Minqi Jiang, UCL, on environment and curriculum design for general RL agents
Generally Intelligent
00:00
Refutation Based Learning Environments
The regret-based work was sort of building off of a paper that introduced an algorithm called PAIRED, which stands for Protagonist Antagonist Induced Regret Environment Design. And so the adversarial teacher and the antagonist are essentially on a team. If there exists an antagonist that is getting a higher return than your student, it means the student can do better on this environment. That, I think, may be turned into your evolving curricula with regret-based environment design.
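The regret idea described here can be sketched in a few lines of Python. This is a minimal illustration, not the PAIRED implementation: the function names, the use of plain mean returns for both agents, and the example numbers are all assumptions made for the sketch.

```python
from statistics import mean


def estimate_regret(student_returns, antagonist_returns):
    """Estimate regret on one environment.

    Assumption: regret is approximated as the antagonist's mean return
    minus the student's (protagonist's) mean return. A positive value
    means the antagonist does better, so the student has room to improve.
    """
    return mean(antagonist_returns) - mean(student_returns)


def pick_highest_regret(envs):
    """Return the name of the environment with the highest estimated regret.

    `envs` maps a hypothetical environment name to a pair
    (student_returns, antagonist_returns).
    """
    return max(envs, key=lambda name: estimate_regret(*envs[name]))


# Made-up returns for two hypothetical environments.
envs = {
    "maze_a": ([1.0, 3.0], [4.0, 6.0]),  # antagonist clearly ahead
    "maze_b": ([5.0, 5.0], [5.0, 6.0]),  # agents nearly tied
}
chosen = pick_highest_regret(envs)
```

A teacher that favors environments like `chosen` is proposing levels that the antagonist can already solve better than the student, which is the sense in which the teacher and the antagonist are on a team.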