
Ian Osband
TalkRL: The Reinforcement Learning Podcast
Balancing statistical and computational complexity in decision-making
In decision-making, one approach is to keep point estimates of the Q-values and act greedily with respect to them. Pure greed, however, can settle on a suboptimal policy, because an action whose value is underestimated may never be tried again, so some randomness is essential. Dithering schemes such as epsilon-greedy and Boltzmann exploration add that randomness at little computational cost, but they can be statistically inefficient and lead to long learning times. The challenge lies in trading off statistical efficiency against computational feasibility. Approaches like Thompson sampling instead choose actions at random, each in proportion to the probability that it is the best one given the data so far.
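As a rough illustration of the action-selection rules named in the summary, here is a minimal Python sketch for a multi-armed bandit. It is not code from the episode: the function names, the `epsilon` and `temperature` parameters, and the Beta-Bernoulli model with a uniform prior used for Thompson sampling are all assumptions made for the example.

```python
import math
import random


def greedy(q_values):
    """Pick the action with the highest point estimate (first one on ties)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def epsilon_greedy(q_values, epsilon, rng):
    """Dithering: with probability epsilon pick uniformly at random, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def boltzmann(q_values, temperature, rng):
    """Dithering: sample an action with probability proportional to exp(Q / temperature)."""
    q_max = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights, k=1)[0]


def thompson_bernoulli(successes, failures, rng):
    """Thompson sampling for Bernoulli rewards (assumed Beta(1, 1) prior per arm).

    Draw one plausible mean reward per arm from its posterior, then act
    greedily with respect to that draw. Each arm is thereby chosen with the
    posterior probability that it is the best arm.
    """
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return greedy(samples)
```

The two dithering rules only ever look at the point estimates, so they explore every action at the same rate regardless of how much is already known about it. The Thompson sampling sketch also needs per-arm counts, which is the extra statistical bookkeeping that lets it stop exploring arms that the data has already ruled out.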