AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
Optimization Theory - Exploring the Search Space and Exploiting Full Time?
The Explore, Explore, Trade-Off and Optimization Theory is used in reinforcement learning. It says that if you've got a bounded amount of time over which you can optimize around a search space, your optimal strategy is to start off very exploratory then take advantage of what you have learned. Thompson Resampling is one of the easiest and best solutions to the multi-arm bandit problem. We evolve into being more exploitative as we get older.