
Steve Byrnes

Neuroscience researcher whose work focuses on artificial general intelligence safety.

Top 3 podcast episodes with Steve Byrnes

Ranked by the Snipd community
33 snips
Oct 30, 2022 • 1h 31min

BI 151 Steve Byrnes: Brain-like AGI Safety

Support the show to get full episodes, full archive, and join the Discord community. Steve Byrnes is a physicist turned AGI safety researcher. He's concerned that when we create AGI, whenever and however that might happen, we run the risk of creating it in a less than perfectly safe way. AGI safety (AGI not doing something bad) is a wide net that encompasses AGI alignment (AGI doing what we want it to do). We discuss a host of ideas Steve writes about in his Intro to Brain-Like-AGI Safety blog series, which uses what he has learned about brains to address how we might safely make AGI. Steve's website. Twitter: @steve47285. Intro to Brain-Like-AGI Safety.
Nov 23, 2024 • 1h 3min

“Neuroscience of human social instincts: a sketch” by Steven Byrnes

Steven Byrnes, a neuroscience researcher focusing on AGI safety, dives deep into the neurological underpinnings of human social instincts. He discusses how our innate sensory heuristics shape social interactions and decision-making. The podcast reveals the tension between compassion and spite in social dynamics and how this affects relationships. Byrnes also explores how involuntary attention impacts our perception and emotional responses, offering insights into empathy and aggression, all crucial for understanding both human and artificial intelligence.
Aug 7, 2024 • 23min

“Self-Other Overlap: A Neglected Approach to AI Alignment” by Marc Carauleanu, Mike Vaiana, Judd Rosenblatt, Diogo de Lucena

Join guests Bogdan Ionut-Cirstea, Steve Byrnes, Gunnar Zarnacke, Jack Foxabbott, and Seong Hah Cho, who contribute critical insights on AI alignment. They discuss an intriguing concept called self-other overlap, which aims to optimize AI models by aligning their reasoning about themselves and others. Early experiments suggest this technique can reduce deceptive behaviors in AI. With its scalable nature and minimal need for interpretability, self-other overlap could be a game-changer in creating pro-social AI.