Machine Learning Street Talk (MLST)

#030 Multi-Armed Bandits and Pure-Exploration (Wouter M. Koolen)

Nov 20, 2020
Wouter M. Koolen, a Senior Researcher at Centrum Wiskunde & Informatica, discusses multi-armed bandits and pure exploration. He covers the balance between exploration and exploitation, illustrated with examples such as clinical trials and game strategies, and explains how to decide when to shift from learning to exploiting the knowledge gained. The conversation also touches on the ethical considerations in decision-making and on the algorithms driving advances in this area, making the theory accessible for practical application.
INSIGHT

Bandits and Human Nature

  • Multi-armed bandit problems represent a fundamental conflict in human nature: choosing between immediate and delayed gratification.
  • They highlight the tension between exploiting known benefits and exploring potentially larger future rewards.
ANECDOTE

The Gambler's Dilemma

  • The gambler's dilemma at slot machines perfectly illustrates multi-armed bandits.
  • Each machine has its own reward distribution, and the gambler must choose which levers to pull to maximize winnings.
ANECDOTE

Tim's Overclocked Desktop

  • Tim Scarfe brought a large, overclocked desktop to graduate school.
  • The computer suffered blue screens that disrupted experiments, illustrating the trade-offs of powerful hardware.