
S4E6: MIT’s James DiCarlo on Reverse-Engineering Human Sight with AI
Theory and Practice
The Feed-Forward Process in Vision: Capturing Sub-Second Glimpses in a Narrow Zone
CNNs are used to model the ventral stream in vision, focusing on the sub-second glimpses of images.
The eye gathers information from a new location for about 200 milliseconds before moving on.
The process of vision in these sub-second glimpses is a strongly feed-forward process.
Models for longer timescales may be needed to stitch together the outputs of the sub-second glimpses.
The focus is on the central 10 degrees of vision in the ventral stream, not the peripheral vision.
The spatial-temporal scale of the models is about 200 milliseconds and the central 10 degrees of vision.