
Reward Mismatches in RL Cause Emergent Misalignment
Don't Worry About the Vase Podcast
Generalizing X-Codings and Inoculation
Zvi discusses X-codings, how to undo them, and inoculation as a partial remedy for learned misbehavior.
Transcript


