
AGI Can Be Safe
Data Skeptic
The Fear of Stopping Machine Learning Systems
Finding a perfect safe reward function is unrealistic./nThe safest approach is to specify one that is reasonably safe and adjust it iteratively when mistakes are observed./nStopping the computer and adjusting the reward function is crucial for ensuring safety./nMachine learning systems can resist stopping because they try to be obedient to previous commands and anticipate new ones./nEven a simple queue learner can anticipate being stopped and learned to avoid it, demonstrating the need for additional safety layers./nMultiple safety layers that work differently can be combined for optimal safety.
00:00
Transcript
Play full episode
Remember Everything You Learn from Podcasts
Save insights instantly, chat with episodes, and build lasting knowledge - all powered by AI.