3min chapter

#92 - SARA HOOKER - Fairness, Interpretability, Language Models

Machine Learning Street Talk (MLST)

CHAPTER

Is RLHF Robustifying or Simplification?

People have cited RLHF as the biggest success of reinforcement learning, which to me is a damning indictment of reinforcement learning. They say that RLHF does a couple of things. One is robustification, in the sense that you'll get the same answer, and the answer will be more aligned with human preferences. But people have also said the models are less creative, and I'm not quite sure, using my machine learning lens, whether you could think of that as robustification. There are other ways you could achieve that: one is just subset selection and spending more time on annotation, where you probably implicitly end up guiding toward certain preferences that are more aligned with our own. So is RL really the trick here?
