On Adversarial Training & Robustness with Bhavna Gopal

Thinking Machines: AI & Philosophy

NOTE

Understanding Instincts: The Limits of Human Explainability

Human decisions and language model responses are alike in that both often arise from instinct rather than conscious reasoning. When asked to explain their actions, people often give retroactive justifications that may not reflect the intuition that actually drove the decision. Just as optical illusions can deceive the eye, perception can be shaped by factors outside conscious awareness. This is why true explainability is hard to achieve: many decisions stem from subconscious connections formed through repeated experience.
