Multi-Armed Bandit
There are situations where a multi-armed bandit is going to be preferable to an A/B test. Depending on what algorithm you use, there may be different guarantees about convergence. The way Thompson sampling works is that it tends to select arms more or less randomly in the early stages. Then, as you go through the experiment and collect more evidence, the bandit naturally starts preferring the arms that have appeared to perform better, selecting them more often.
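The behaviour described above can be sketched in a few lines. This is a minimal, hypothetical example of Thompson sampling for two arms with binary (click/no-click) rewards, where each arm's conversion rate gets a Beta posterior; the arm names, true rates, and trial count are illustrative assumptions, not anything from the episode.

```python
import random

class ThompsonSampler:
    """Thompson sampling for Bernoulli arms using Beta posteriors."""

    def __init__(self, n_arms):
        self.successes = [0] * n_arms
        self.failures = [0] * n_arms

    def select_arm(self):
        # Draw one sample from each arm's Beta(successes+1, failures+1)
        # posterior. Early on the posteriors are wide, so selection is
        # close to random; as evidence accumulates, better arms win the
        # draw (and thus get selected) more often.
        samples = [random.betavariate(s + 1, f + 1)
                   for s, f in zip(self.successes, self.failures)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        if reward:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1

# Simulated experiment with two arms; the "true" rates are made up.
random.seed(0)
true_rates = [0.05, 0.10]
bandit = ThompsonSampler(n_arms=2)
pulls = [0, 0]
for _ in range(5000):
    arm = bandit.select_arm()
    reward = random.random() < true_rates[arm]
    bandit.update(arm, reward)
    pulls[arm] += 1

# After enough trials, the better arm (index 1 here) should have
# received the large majority of the traffic.
print(pulls)
```

This contrasts with a classic A/B test, which would split traffic 50/50 for the whole experiment; the bandit instead shifts traffic toward the apparent winner as the evidence comes in.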