
A Manhattan Project for AI Safety (Robert Wright & Samuel Hammond)
Robert Wright's Nonzero
Representation of Facts in AI Systems and the Role of Interpretability
The chapter discusses how facts are represented inside AI systems and how interpretability research can help address the alignment problem. It highlights a well-known example in which a model holding an incorrect fact about the Eiffel Tower's location produced consistently inaccurate answers, and draws an analogy to neuroscience to underscore the need to understand how models work internally.