AXRP - the AI X-risk Research Podcast cover image

3 - Negotiable Reinforcement Learning with Andrew Critch

AXRP - the AI X-risk Research Podcast

CHAPTER

Theorem of the Stock Market

In this theorem, you sort of assume that im palsies can be staccastic. E think it's slightly different than what readers might think. At every am, time step you can ranamize what you're going to do. The a i system is like flipping a coin and its palisy is just what the weight of that coin is. And it can also randamize at outset if it wants. It can choose a random seed at the beginning. Iought to choose between two different random policies. So it has a memory in sense. You could just a few diferet ways of formalizing it. Or you could just have that a it that i can remember

00:00
Transcript
Play full episode

Remember Everything You Learn from Podcasts

Save insights instantly, chat with episodes, and build lasting knowledge - all powered by AI.
App store bannerPlay store banner