
#131 Toby Ord - Will AI Destroy Humanity?
Within Reason
How RL from human feedback injects agency
Toby describes RL from human feedback and how dialog training makes models more goal-directed.