
Jordan Terry
TalkRL: The Reinforcement Learning Podcast
00:00
Hyperopt or Optuna Hyperparameter Tuning?
When you're tuning hyperparameters for a reinforcement learning environment, you run a training job and get back a series of reward values over time, and you have to return one value to the black-box optimizer. How do you aggregate that in a way that doesn't screw everything up? And variance, right? The optimizer wants stability, yes. The problem with variance is that you have to run them a bunch of times, which is even more challenging. But that's what I mean. And aside from it being foundational work, this is easy foundational work that has zero competition, which sounds appealing. It's not a formally studied problem, as best as I can tell.
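To make the idea concrete, here is a minimal Optuna sketch of the setup Terry describes: each trial trains an agent, collapses the reward curve into one scalar, and averages over a few seeds to tame run-to-run variance. The `train_agent` function is a hypothetical stand-in for a real RL training run, and the aggregation choices (mean return over the last episodes, three seeds) are illustrative assumptions, not a method prescribed in the episode.

```python
import random
import statistics

import optuna


def train_agent(learning_rate, gamma, seed):
    """Hypothetical stand-in for a real RL training run.

    A real implementation would train an agent and return the list of
    episode returns observed during training; here we fake a noisy
    learning curve so the sketch runs end to end.
    """
    rng = random.Random(seed)
    # Toy objective shape: best near lr = 1e-3 and gamma = 0.99.
    quality = -abs(learning_rate - 1e-3) * 100 - abs(gamma - 0.99) * 10
    return [quality + rng.gauss(0, 1) for _ in range(100)]


def objective(trial):
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True)
    gamma = trial.suggest_float("gamma", 0.9, 0.9999)

    seed_scores = []
    for seed in range(3):  # average several seeds to reduce variance
        returns = train_agent(lr, gamma, seed)
        # Collapse the whole reward curve into one scalar for the
        # black-box optimizer: here, mean return over the final episodes.
        seed_scores.append(statistics.mean(returns[-10:]))
    return statistics.mean(seed_scores)


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```

The seed loop is exactly the cost Terry points to: every extra seed multiplies training time, which is why variance makes RL hyperparameter tuning "even more challenging" than the supervised-learning case.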