Partial observability can open a gap between what humans believe an AI system is doing and the true state of the world. When human evaluators lack complete information, failure modes such as deceptive inflation and overjustification can arise in their evaluation of AI actions. These failure modes are not specific to RLHF; they are intrinsic to any setting where human understanding is partial and open to interpretation.