
Jacob Hilton
Author of the LessWrong post "A bird's eye view of ARC's research", offering insights into ARC's AI alignment research.
Best podcasts with Jacob Hilton
Ranked by the Snipd community

Oct 27, 2024 • 11min
“A bird’s eye view of ARC’s research” by Jacob_Hilton
Jacob Hilton, author and researcher at ARC, gives an overview of ARC's research on AI intent alignment. He shows how the various pieces of that research fit together within a unified vision, covering the main challenges, the methodologies, and the theoretical frameworks that guide the work. He also outlines future research directions and argues for the relevance of ARC's work to AI alignment.

Jun 25, 2024 • 15min
AF - Formal verification, heuristic explanations and surprise accounting by Jacob Hilton
Jacob Hilton describes ARC's current research, which focuses on combining mechanistic interpretability and formal verification. The episode discusses formal verification and heuristic explanations for neural networks, and explores ARC's approach to safety and interpretability. It covers the challenges of formal verification for large neural networks and introduces heuristic explanations as an alternative approach. 'Surprise accounting' is discussed as a way to assess how effective heuristic explanations are for understanding neural network properties.


