[Crossposted from windowsontheory]
The following statements seem to be both important for AI safety and not widely agreed upon. These are my opinions, not those of my employer or colleagues. As is true for anything involving AI, there is significant uncertainty about everything written below. However, for readability, I present these points in their strongest form, without hedges and caveats. That said, it is essential not to be dogmatic, and I am open to changing my mind based on evidence. None of these points are novel; others have advanced similar arguments. I am sure that for each statement below, there will be people who find it obvious and people who find it obviously false.
- AI safety will not be solved on its own.
- An “AI scientist” will not solve it either.
- Alignment is not about loving humanity; it's about robust reasonable compliance.
- Detection is more important than prevention. [...]
---
Outline:
(02:44) 1. AI safety will not be solved on its own.
(05:52) 2. An AI scientist will not solve it either.
(11:00) 3. Alignment is not about loving humanity; it's about robust reasonable compliance.
(19:57) 4. Detection is more important than prevention.
(22:40) 5. Interpretability is neither sufficient nor necessary for alignment.
(24:56) 6. Humanity can survive an unaligned superintelligence.
The original text contained 4 images which were described by AI.
---
First published:
January 24th, 2025
Source:
https://www.lesswrong.com/posts/3jnziqCF3vA2NXAKp/six-thoughts-on-ai-safety
---
Narrated by TYPE III AUDIO.
---