

Davidad Dalrymple: Towards Provably Safe AI
Sep 5, 2024
Davidad Dalrymple, Programme Director at ARIA and former AI safety researcher at Oxford, dives into the complexities of artificial intelligence. He discusses shifting perspectives on AI risk, the significance of calibrated predictions for AGI timelines, and the orthogonality thesis. Davidad emphasizes accountability in AI development, highlighting the need for robust safety frameworks. He also explores the interconnected challenges of cyber-physical systems and the importance of diverse research approaches to navigating potential AGI pitfalls. Tune in for insightful reflections on the future of AI!
Learning Calibration from Experience
- Davidad Dalrymple once predicted that a simulation of a worm's nervous system would be achieved by 2020, but it did not happen.
- He improved his calibration by training himself to assign explicit probabilities to predictions and learning from the outcomes over time.
Navigating Radical AI Uncertainty
- Dalrymple's median estimate for AGI arrival is around 2030, with broad uncertainty.
- He pursues a portfolio approach to research in order to handle the radical uncertainty about timelines and safety outcomes.
Orthogonality Thesis Reconsidered
- Dalrymple shifted from rejecting the orthogonality thesis (the idea that an agent's capability level and its goals can vary independently) to considering it seriously, persuaded by its supporting arguments and by progress in deep learning.
- AI can achieve strong capabilities without human-like goal reflection or moral reasoning.