The speakers discuss the use of RLHF to train a pet robot and the historical context of cybernetics during World War II.
Episode notes
In this episode, Tom gives us a lesson on all things feedback, mostly where our scientific framings of it came from. Together, we link this to RLHF, our previous work in RL, and how we were thinking about agentic ML systems before it was cool. Join us for another great blast from the past on The Retort! We have also brought you video this week!