3min snip

#367 – Sam Altman: OpenAI CEO on GPT-4, ChatGPT, and the Future of AI

Lex Fridman Podcast

NOTE

How Reinforcement Learning with Human Feedback (RLHF) Makes ChatGPT Better

RLHF is a process in which human feedback is used to align a language model with what humans want it to do. It makes the model more useful and easier to use, improving its ability to understand instructions and produce the desired responses, and it requires remarkably little data and human supervision. The human guidance added through RLHF creates a sense of alignment between the model and the user: it feels like the model is trying to help. The science of human guidance is an important area of research, focused on making language models more usable, wise, ethical, and aligned with human values.

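As a rough illustration of the idea discussed in the note, the sketch below trains a tiny reward model from human preference pairs, which is the core ingredient of RLHF. All names, dimensions, and the toy data are hypothetical stand-ins, not OpenAI's actual pipeline; in practice the reward model is itself a large language model, and the policy is then fine-tuned (e.g. with PPO) to produce responses the reward model scores highly.

```python
# Minimal RLHF-style sketch (assumed toy setup), using PyTorch.
import torch
import torch.nn as nn

# 1) A tiny "reward model": given a response embedding, predict a scalar reward.
class RewardModel(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x):
        return self.score(x).squeeze(-1)

torch.manual_seed(0)
reward_model = RewardModel()
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-2)

# 2) Human feedback as preference pairs: for each prompt, the embedding of the
#    response a human preferred vs. the one they rejected (toy random data here).
preferred = torch.randn(64, 16) + 0.5
rejected = torch.randn(64, 16) - 0.5

# 3) Fit the reward model with the standard pairwise preference loss:
#    maximize log sigmoid(r(preferred) - r(rejected)).
for step in range(200):
    loss = -torch.nn.functional.logsigmoid(
        reward_model(preferred) - reward_model(rejected)
    ).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# 4) The trained reward model then scores candidate responses; the language
#    model would be fine-tuned (e.g. with PPO) to raise these scores while
#    staying close to its original behavior.
candidates = torch.randn(4, 16)
print("reward scores:", reward_model(candidates).tolist())
```

The pairwise loss is what lets relatively little human data go a long way: each comparison only asks a person which of two responses is better, and the reward model generalizes that preference signal to unseen responses.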