
3 - Negotiable Reinforcement Learning with Andrew Critch

AXRP - the AI X-risk Research Podcast


Theorem of the Stock Market

In this theorem, you sort of assume that policies can be stochastic. I think it's slightly different than what readers might think. At every time step, you can randomize what you're going to do. The AI system is like flipping a coin, and its policy is just what the weight of that coin is. And it can also randomize at the outset if it wants. It can choose a random seed at the beginning to choose between two different random policies. So it has a memory, in a sense. There are just a few different ways of formalizing it. Or you could just have that the AI can remember
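To make the two kinds of randomization concrete, here is a minimal Python sketch. It is an illustration only, not the formalization used in the theorem: the function names, the binary action space, and the specific coin weights are all assumptions made for the example.

```python
import random


def stepwise_policy(weight, rng):
    """Hypothetical policy: flip a coin with the given weight at every time step."""
    def act(observation):
        # Action 1 with probability `weight`, otherwise action 0.
        return 1 if rng.random() < weight else 0
    return act


def mixture_policy(weight_a, weight_b, mix_prob, rng):
    """Hypothetical policy: randomize once at the outset, then remember the choice."""
    # One draw at the start picks which stochastic policy to follow.
    # The result is stored, which is the "memory" of the initial coin flip.
    chosen_weight = weight_a if rng.random() < mix_prob else weight_b
    return stepwise_policy(chosen_weight, rng), chosen_weight


def rollout(policy, horizon):
    """Run the policy for `horizon` steps (observations are ignored in this sketch)."""
    return [policy(None) for _ in range(horizon)]


rng = random.Random(0)
# Extreme weights (assumed for illustration) make the effect of memory visible:
# whichever policy is drawn at the outset, every later action agrees with it.
policy, chosen = mixture_policy(0.0, 1.0, 0.5, rng)
actions = rollout(policy, 10)
```

With these weights, a policy that re-flipped a fair coin at every step would produce a mix of both actions, while the mixture commits to one action for the whole episode. That difference is why remembering the initial draw matters.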
