8 - Assistance Games with Dylan Hadfield-Menell

AXRP - the AI X-risk Research Podcast

The Assistive Multi-Armed Bandit

Lawrence Chan was the first author on a paper called "The Assistive Multi-Armed Bandit." In this case we look at an assistance game where the person doesn't observe theta at the start of the game; in fact, they learn about it over time through reward signals. And there are some really interesting things we identify about ways that the system can help you learn before learning what you want.
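To make that setup concrete, here is a minimal sketch of one way such a game could be simulated. It is an illustration under stated assumptions, not the paper's implementation: the Bernoulli arms, the epsilon-greedy human, the two robot policies, and every name in the code (`simulate`, `follow_human`, `most_suggested`) are hypothetical choices made for this example. The only part taken from the description above is the information structure: the person learns about theta only through rewards, and the system sees the person's choices but not theta.

```python
import random

def follow_human(suggestions, human_arm):
    """Hypothetical baseline robot: pull whatever the human suggests."""
    return human_arm

def most_suggested(suggestions, human_arm):
    """Hypothetical robot: pull the arm the human has suggested most often."""
    return max(range(len(suggestions)), key=lambda a: suggestions[a])

def simulate(theta, rounds, robot_policy=follow_human, epsilon=0.1, seed=0):
    """Simulate an assistive bandit sketch.

    theta: Bernoulli success probability per arm, hidden from both players.
    The human sees rewards; the robot sees only the human's suggestions.
    """
    rng = random.Random(seed)
    n = len(theta)
    pulls = [0] * n        # how often each arm was actually pulled
    wins = [0] * n         # rewards the human has observed per arm
    suggestions = [0] * n  # the robot's only source of information
    total = 0

    def human_estimate(arm):
        return wins[arm] / pulls[arm] if pulls[arm] else 0.0

    for t in range(rounds):
        # Assumed human model: suggest each arm once, then epsilon-greedy.
        if t < n:
            human_arm = t
        elif rng.random() < epsilon:
            human_arm = rng.randrange(n)
        else:
            human_arm = max(range(n), key=human_estimate)
        suggestions[human_arm] += 1

        # The robot decides which arm is actually pulled.
        arm = robot_policy(suggestions, human_arm)

        # Only the human observes the resulting reward.
        reward = 1 if rng.random() < theta[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
        total += reward

    return total, suggestions
```

Calling `simulate([0.2, 0.8], 200)` and then `simulate([0.2, 0.8], 200, robot_policy=most_suggested)` gives a rough feel for how the robot's choice of which arm to pull changes what the person gets to learn, since the person only sees rewards from arms the robot actually pulls.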

Play episode from 44:23