
Jordan Terry

TalkRL: The Reinforcement Learning Podcast

00:00

Hyperopt or Optuna Hyperparameter Tuning?

When you're tuning a reinforcement learning environment, what you do is you take the series of reward values from training and, at some point, return one value to the black box. How do you aggregate that in a way that doesn't screw everything up, given the variance, right? Your optimizer wants stability. And the problem with the variance is that you have to run the trials a bunch of times, which is even more challenging. But that's what I mean. And aside from it being foundational work, this easy foundational work has zero competition, which sounds appealing. It's not a formally studied problem, as best as
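The idea described above can be sketched in plain Python. This is a hypothetical stand-in, not the speaker's actual setup: a noisy function simulates the variance of an RL training run, the objective averages returns over several seeds to hand the optimizer one lower-variance scalar, and a simple random-search loop stands in for a black-box tuner like Optuna or Hyperopt. All names and the toy score function are illustrative assumptions.

```python
import random


def train_rl_env(lr, seed):
    # Hypothetical stand-in for one RL training run: a "true" quality
    # curve over the learning rate, plus seed-dependent noise standing
    # in for the variance of episodic returns.
    rng = random.Random(seed)
    true_score = -(lr - 0.01) ** 2
    return true_score + rng.gauss(0, 0.0001)


def objective(lr, n_seeds=5):
    # Average returns over several seeds so the black-box optimizer
    # sees a single, lower-variance scalar per configuration -- the
    # cost is running the trial n_seeds times.
    return sum(train_rl_env(lr, s) for s in range(n_seeds)) / n_seeds


# Simple seeded random search standing in for Optuna/Hyperopt.
search_rng = random.Random(0)
best_lr, best_score = None, float("-inf")
for _ in range(50):
    lr = 10 ** search_rng.uniform(-4, -1)  # log-uniform sample
    score = objective(lr)
    if score > best_score:
        best_lr, best_score = lr, score

print(best_lr)
```

Because both the noise and the search are seeded, the loop reliably recovers a learning rate near the toy optimum of 0.01; with a real environment, the number of seeds trades compute against the stability the optimizer needs.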

Transcript
