Using the Relative Performance of Each Agent

Using the relative performance of each agent with each environment. So maybe what you want to see is a fair amount of difference in payoff between different pairs. And that's kind of like the MCC approach, minimal criterion coevolution, where you basically want each agent to only need to be able to solve a single environment and each environment to only need to be solved by a single agent. But I think in the original, the original POET, they didn't have the rankings part, which was added in the second one. You know, I also think in the original POET they had explicit novelty search based on the encoding of the environment. That makes sense.
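To make the idea concrete, here is a minimal sketch, not the method from either paper. It assumes a small payoff table with one row per agent and one column per environment, describes each environment by how it ranks the agents, and checks a minimal criterion read as "every agent solves at least one environment and every environment is solved by at least one agent". The names `rank_profile`, `rank_distance`, `meets_minimal_criterion`, the `solve_threshold` parameter, and the payoff numbers are all hypothetical and chosen only for illustration.

```python
from typing import List

# payoffs[a][e] is the score of agent a in environment e (illustrative layout).
Payoffs = List[List[float]]


def rank_profile(payoffs: Payoffs, env: int) -> List[int]:
    """Rank of each agent in one environment (0 = best score).

    Ties are broken by agent index, because sorted() is stable.
    """
    scores = [row[env] for row in payoffs]
    order = sorted(range(len(scores)), key=lambda a: -scores[a])
    ranks = [0] * len(scores)
    for rank, agent in enumerate(order):
        ranks[agent] = rank
    return ranks


def rank_distance(payoffs: Payoffs, env_a: int, env_b: int) -> int:
    """Sum of absolute rank differences between two environments.

    A large value means the two environments order the agents differently.
    """
    ranks_a = rank_profile(payoffs, env_a)
    ranks_b = rank_profile(payoffs, env_b)
    return sum(abs(x - y) for x, y in zip(ranks_a, ranks_b))


def meets_minimal_criterion(payoffs: Payoffs, solve_threshold: float) -> bool:
    """Assumed criterion: every agent solves at least one environment,
    and every environment is solved by at least one agent."""
    solved = [[score >= solve_threshold for score in row] for row in payoffs]
    every_agent_solves_one = all(any(row) for row in solved)
    every_env_is_solved = all(any(col) for col in zip(*solved))
    return every_agent_solves_one and every_env_is_solved


# Made-up payoffs: each agent is strongest in a different environment.
payoffs = [
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.2],
    [0.1, 0.3, 0.7],
]

print(rank_profile(payoffs, 0))
print(rank_distance(payoffs, 0, 1))
print(meets_minimal_criterion(payoffs, solve_threshold=0.5))
```

Comparing environments by agent rankings rather than by an encoding of the environment means the measure depends only on the payoff table, so it does not need a hand-designed environment descriptor.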
