Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Jan 4, 2025
32:19
Episode notes
Audio versions of blogs and papers from BlueDot courses.
This paper explains Anthropic’s constitutional AI approach, which is largely an extension of RLHF but with AIs replacing the human demonstrators and human evaluators.