Optimization Theory - Exploring the Search Space and Exploiting Full Time?

The Explore, Explore, Trade-Off and Optimization Theory is used in reinforcement learning. It says that if you've got a bounded amount of time over which you can optimize around a search space, your optimal strategy is to start off very exploratory then take advantage of what you have learned. Thompson Resampling is one of the easiest and best solutions to the multi-arm bandit problem. We evolve into being more exploitative as we get older.

Play episode from 39:21

Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app