
#367 – Sam Altman: OpenAI CEO on GPT-4, ChatGPT, and the Future of AI
Lex Fridman Podcast
The Usability of ChatGPT: The Magic Ingredient of RLHF
ChatGPT is built on a powerful model that becomes far more usable through reinforcement learning from human feedback (RLHF). The underlying model learns from text data and can do amazing things, but on its own it is not easy to use. RLHF is the magic ingredient that aligns the model with human preferences. In its simplest form, people are shown two outputs and asked which one is better, and the model improves from that feedback with remarkably little data.
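The "show two outputs, ask which is better" step can be made concrete with a small sketch. The code below is not OpenAI's implementation; it assumes a common formulation in which each comparison trains a reward model with a pairwise (Bradley-Terry style) loss. The linear reward model and the two-number feature vectors (a "helpfulness" score and a "rudeness" score) are hypothetical stand-ins for a real neural network and real text.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss: -log(sigmoid(r_chosen - r_rejected)).

    Small when the preferred output already scores higher,
    large when the reward model ranks the pair the wrong way round.
    """
    margin = reward_chosen - reward_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def reward(weights: list[float], features: list[float]) -> float:
    """Hypothetical linear reward model: a weighted sum of features."""
    return sum(w * x for w, x in zip(weights, features))


def train_reward_model(comparisons, n_features, lr=0.5, epochs=200):
    """Fit weights by gradient descent on (chosen, rejected) feature pairs."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for chosen, rejected in comparisons:
            margin = reward(weights, chosen) - reward(weights, rejected)
            # d/d(margin) of -log(sigmoid(margin)) is -(1 - sigmoid(margin)).
            grad_scale = -(1.0 - sigmoid(margin))
            for i in range(n_features):
                weights[i] -= lr * grad_scale * (chosen[i] - rejected[i])
    return weights


# Illustrative, made-up data: each pair is (features of the output the
# human preferred, features of the output the human rejected).
# Feature order (an assumption): [helpfulness, rudeness].
comparisons = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.8, 0.1], [0.3, 0.6]),
]

weights = train_reward_model(comparisons, n_features=2)
```

After training on just these two comparisons, the learned weights reward helpfulness and penalize rudeness, which is a toy version of the point that a small amount of preference data can steer the model. In a full RLHF pipeline, a reward model like this would then be used as the training signal for a reinforcement learning step on the language model itself, which this sketch leaves out.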