Super Data Science: ML & AI Podcast with Jon Krohn cover image

Super Data Science: ML & AI Podcast with Jon Krohn

791: Reinforcement Learning from Human Feedback (RLHF), with Dr. Nathan Lambert

Jun 11, 2024
57:10

Reinforcement learning through human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique’s origins of the technique. He also walks through other ways to fine-tune LLMs, and how he believes generative AI might democratize education.


This episode is brought to you by AWS Inferentia (go.aws/3zWS0au) and AWS Trainium (go.aws/3ycV6K0), and Crawlbase (crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.


In this episode you will learn:

• Why it is important that AI is open [03:13]

• The efficacy and scalability of direct preference optimization [07:32]

• Robotics and LLMs [14:32]

• The challenges to aligning reward models with human preferences [23:00]

• How to make sure AI’s decision making on preferences reflect desirable behavior [28:52]

• Why Nathan believes AI is closer to alchemy than science [37:38]


Additional materials: www.superdatascience.com/791

Get the Snipd
podcast app

Unlock the knowledge in podcasts with the podcast player of the future.
App store bannerPlay store banner

AI-powered
podcast player

Listen to all your favourite podcasts with AI-powered features

Discover
highlights

Listen to the best highlights from the podcasts you love and dive into the full episode

Save any
moment

Hear something you like? Tap your headphones to save it with AI-generated key takeaways

Share
& Export

Send highlights to Twitter, WhatsApp or export them to Notion, Readwise & more

AI-powered
podcast player

Listen to all your favourite podcasts with AI-powered features

Discover
highlights

Listen to the best highlights from the podcasts you love and dive into the full episode