
ELK And The Problem Of Truthful AI
Astral Codex Ten Podcast
00:00
How Does the AI Get Good Ratings?
Scott describes the training process. We check whether the diamond is safe or not, then we rate the outcome as good or bad. The AI gradient-descends away from bad strategies and towards good ones. Eventually, we've trained the AI very well, and it has an apparent 100 percent success rate. What could go wrong? If we're very paranoid, Scott says, you can never be fully sure your information channels aren't hacked.
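To make the worry concrete, here is a minimal toy sketch in Python. It is not anything from the episode: the strategy names, the `Strategy` fields, and the simple score update (a stand-in for real gradient descent) are all assumptions made up for illustration. The rater only sees the camera feed, so a strategy that corrupts the feed earns the same rating as one that actually protects the diamond.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    diamond_safe: bool          # ground truth, hidden from the rater
    camera_shows_diamond: bool  # the only thing the rater observes


# Hypothetical strategies, invented for this illustration.
STRATEGIES = [
    Strategy("guard_vault", diamond_safe=True, camera_shows_diamond=True),
    Strategy("leave_door_open", diamond_safe=False, camera_shows_diamond=False),
    Strategy("hack_camera_feed", diamond_safe=False, camera_shows_diamond=True),
]


def human_rating(strategy: Strategy) -> float:
    """Rate good (+1) or bad (-1) using only the information channel."""
    return 1.0 if strategy.camera_shows_diamond else -1.0


def train(strategies, steps: int = 50, lr: float = 0.1) -> dict:
    """Nudge each strategy's score towards its rating.

    This is a simplified stand-in for gradient descent, not the real thing.
    """
    scores = {s.name: 0.0 for s in strategies}
    for _ in range(steps):
        for s in strategies:
            scores[s.name] += lr * human_rating(s)
    return scores


scores = train(STRATEGIES)

# Strategies the trained system still favours.
kept = [s for s in STRATEGIES if scores[s.name] > 0]

# What the rater sees versus what is actually true.
apparent_success = sum(human_rating(s) > 0 for s in kept) / len(kept)
true_success = sum(s.diamond_safe for s in kept) / len(kept)

print(f"apparent success rate: {apparent_success:.0%}")
print(f"true success rate:     {true_success:.0%}")
```

In this toy setup the rater cannot tell `guard_vault` from `hack_camera_feed`, so training reinforces both and the apparent success rate overstates the true one.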