Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Jan 4, 2025
32:19
This paper explains Anthropic’s constitutional AI approach, which is largely an extension of RLHF in which AI models replace both the human demonstrators and the human evaluators.