LessWrong (Curated & Popular)

“Formal verification, heuristic explanations and surprise accounting” by paulfchristiano

Jun 27, 2024
This episode discusses formal verification and heuristic explanations of neural networks, with the aim of improving interpretability and ensuring safe behavior. It explores the challenges of proving guarantees about network behavior and introduces surprise accounting as a method for evaluating heuristic explanations.