#8 Uri Hasson: Language in the real world for brains and AI
Dec 13, 2023
Neuroscientist Uri Hasson discusses the neural basis of natural language acquisition and processing. Topics include temporal receptive windows, Wittgenstein, evolution, ChatGPT, transformers, multimodal integration, episodic memory, interactive sociality, and understanding in neuroscience/AI.
Behavior can be understood through surface-level observation and statistical analysis, rather than by appealing solely to internal mechanisms or storytelling.
Incorporating multimodal integration, episodic memory, and interactive sociality is crucial for developing comprehensive and realistic cognitive models in AI and neuroscience.
Deep dives
The Surface Model of Behavior
The speaker argues that understanding behavior does not require looking inside the system; focusing on surface-level observations can suffice. Gradually building models in this way explains behavior better than relying on storytelling or a few simplistic parameters. The speaker also stresses the role of statistics in understanding behavior, both as a listener and as a speaker.
The Parallels between Deep Learning and the Brain
Initially skeptical of deep learning models, the speaker now sees parallels between them and the brain. Deep learning models, with their millions of parameters, can learn complex patterns without explicitly specifying features. By comparing deep learning to the brain, the speaker suggests that the brain may also rely on blind optimization and simple principles for learning. The speaker recognizes the need for models that capture the complexity of cognitive processes and incorporate the context in which language and behavior occur.
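To make the "blind optimization" framing concrete, here is a minimal, hypothetical sketch (not from the episode or the Direct Fit paper): an over-parameterized network fit to densely sampled noisy data by generic gradient descent, with no hand-specified features or rules.

```python
# Hypothetical illustration of "direct fit": many parameters, dense data,
# blind optimization of prediction error, no hand-crafted features.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Densely sampled noisy observations of an unknown function.
x = torch.linspace(-3, 3, 500).unsqueeze(1)
y = torch.sin(2 * x) + 0.1 * torch.randn_like(x)

# Far more parameters than strictly needed; no features specified by hand.
model = nn.Sequential(
    nn.Linear(1, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Blind optimization: adjust all parameters to reduce prediction error.
for step in range(2000):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

print(f"final training loss: {loss.item():.4f}")
```

The point of the sketch is only the learning regime: structure emerges from fitting statistics across many examples, not from explicitly specified rules.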
Missing Components in AI and Neuroscience
The speaker highlights the need for multimodal integration, along with episodic memory-like structures and interactive sociality. Current models often focus on language and visual data but fail to capture the richness of human experience. Incorporating these missing components could lead to more comprehensive and realistic cognitive models, both in AI and in neuroscience research.
The Thousand Days Project and the Future of Animal Communication
The speaker introduces the Thousand Days Project, a research endeavor capturing the behavior and language development of babies from birth for 1,000 days. The goal is to understand language acquisition and model it from a baby's perspective. This project could also be applied to studying animal communication by capturing multimodal data, such as position tracking, visual data, and neural activity. By analyzing the data with language models, it may be possible to better understand the meaning and context of animal communication.
Uri Hasson runs a lab at Princeton, where he investigates the neural basis of natural language acquisition and processing as it unfolds in the real world. As Uri visited Tübingen (where I am doing my master's), we were able to meet in person. Originally, I planned to talk about his idea of temporal receptive windows and how different brain regions (e.g. the default mode network) operate at different timescales. However, we ended up talking more about Wittgenstein, evolution, and ChatGPT. An underlying thread throughout the conversation was that, for both biological and artificial agents, language is not clever symbol and rule manipulation but a brute-force fit to statistics across (Wittgensteinian) 'contexts'. This view is best articulated in Uri's Direct Fit paper. We also connect this to transformers and discuss what's missing in AI: multimodal integration, episodic memory, and interactive sociality. At the end, I ask Uri about his 1000 days project, talking to crows, and "understanding" in neuroscience/AI.
Hasson et al., 2015 - Hierarchical process memory: memory as an integral component of information processing (temporal receptive windows) paper
Hasson et al., 2020 - Direct Fit to Nature: An Evolutionary Perspective on Biological and Artificial Neural Networks paper
Yeshurun et al., 2021 - The default mode network: where the idiosyncratic self meets the shared social world paper
Goldstein et al., 2022 - The Temporal Structure of Language Processing in the Human Brain Corresponds to The Layered Hierarchy of Deep Language Models preprint
Nguyen et al., 2022 - Teacher-student neural coupling during teaching and learning paper
Goldstein et al., 2022 - Shared computational principles for language processing in humans and deep language models paper
Also mentioned:
Podcast episode with Tony Zador on Genomic Bottlenecks link