Rohin Shah

TalkRL: The Reinforcement Learning Podcast

CHAPTER

Reward Functions in Reinforcement Learning

The MineRL BASALT competition is based on MineRL, the Minecraft-based RL environment. The idea here is that you consider some possible situations where the AI could do things and then ask a human: in these particular situations, what should the AI system do? So you're making more local queries and local specifications, rather than having to reason about every possible circumstance that can ever arise. You can then train an agent to meet that specification as best as it can. So their job is to make an agent that actually does maximize performance and minimize the cost of what it takes from humans to learn from them. They can write down a reward function by hand, if that seems
