Is It Related to Inverse Reinforcement Learning?
The new method, which people call RLHF, reinforcement learning from human feedback, is really taking over. This seems somewhat related to inverse reinforcement learning, where the idea is to learn the preferences of the human. I think we're increasingly realizing that if we want AI that people will actually use and engage with in the way that they do now, it matters that social others provide you with dense feedback. Social others are hugely consequential for how we behave.