The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Learning Long-Time Dependencies with RNNs w/ Konstantin Rusch - #484

May 17, 2021
In this discussion, Konstantin Rusch, a PhD student at ETH Zurich, dives into new recurrent neural network (RNN) architectures designed to learn long-time dependencies. He shares insights from his neuroscience-inspired coRNN and UnICORNN papers and explains how these architectures compare to traditional models like LSTMs. Konstantin also covers the challenge of keeping gradients stable and the techniques that give these RNNs their expressive power. Plus, he discusses his ambitions for future improvements in memory efficiency and performance.
AI Snips
INSIGHT

Vanishing/Exploding Gradient Problem

  • RNNs struggle to learn long-time dependencies due to the vanishing/exploding gradient problem during backpropagation.
  • This arises from long products in the gradient calculation, which shrink or grow exponentially with sequence length (see the sketch after this list).
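
A minimal NumPy sketch of that mechanism (not from the episode; the matrix, hidden size, and sequence length are illustrative): repeatedly multiplying a gradient by the recurrent Jacobian drives it toward zero when the recurrence is contractive and toward overflow when it is expansive.

import numpy as np

# Illustrative only: J stands in for the recurrent Jacobian dh_t/dh_{t-1}.
# Backpropagation through time multiplies T such factors together, so a
# spectral radius below 1 makes the gradient vanish and above 1 makes it explode.
T, hidden = 500, 64
for scale in (0.9, 1.1):
    J = scale * np.eye(hidden)        # contractive (0.9) vs. expansive (1.1) recurrence
    grad = np.ones(hidden)
    for _ in range(T):
        grad = J.T @ grad             # accumulate the product over T time steps
    print(f"scale={scale}: |grad| = {np.linalg.norm(grad):.3e}")
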
INSIGHT

Neuroscience-Inspired RNNs

  • Konstantin's approach to RNNs is inspired by neuroscience, specifically the oscillatory behavior of neurons.
  • Oscillatory behavior offers stability, both in the system's state and its gradient, potentially enabling long-time dependency learning (a toy oscillator-style update is sketched below).
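
A minimal NumPy sketch of an oscillator-style recurrent update in the spirit of coRNN. The discretization, the damping and restoring terms (eps, gamma), the step size dt, and the toy weights are assumptions for illustration; see the coRNN paper for the authors' exact formulation. The point is only the structure: a second-order, oscillator-like update rather than an LSTM's gated update.

import numpy as np

# Assumed form, in the spirit of coRNN: the hidden state y behaves like a position
# and z like its velocity, driven by the input through a nonlinearity, pulled back
# by -gamma*y (restoring force) and damped by -eps*z (friction).
def oscillatory_step(y, z, u, W, W_hat, V, b, dt=0.01, gamma=1.0, eps=0.01):
    accel = np.tanh(W @ y + W_hat @ z + V @ u + b) - gamma * y - eps * z
    z_new = z + dt * accel      # velocity update
    y_new = y + dt * z_new      # position update: this is the hidden state
    return y_new, z_new

# Toy usage with random weights (illustrative only): hidden size 32, input size 8.
rng = np.random.default_rng(0)
h, d = 32, 8
W, W_hat = rng.normal(0, 0.1, (h, h)), rng.normal(0, 0.1, (h, h))
V, b = rng.normal(0, 0.1, (h, d)), np.zeros(h)
y = z = np.zeros(h)
for u in rng.normal(size=(100, d)):   # run over a length-100 input sequence
    y, z = oscillatory_step(y, z, u, W, W_hat, V, b)
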
ANECDOTE

coRNN Performance on Adding Problem

  • Konstantin benchmarked coRNN on the adding problem, a synthetic long-time dependency task.
  • coRNN converged directly even with sequences of length 5,000, outperforming LSTMs, which failed at shorter lengths (the task setup is sketched below).
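
For context, a sketch of the adding problem in its common formulation: two-channel sequences of values and markers, with the target being the sum of the two marked values. The exact setup used in the paper's experiments may differ in details; value range and marker placement below follow the usual convention.

import numpy as np

# Common formulation of the adding problem (assumed details: uniform values in [0, 1],
# exactly two marked positions per sequence, target is the sum of the marked values).
def adding_problem(n_examples, T=5000, seed=0):
    rng = np.random.default_rng(seed)
    values = rng.uniform(0.0, 1.0, size=(n_examples, T))
    markers = np.zeros((n_examples, T))
    for i in range(n_examples):
        markers[i, rng.choice(T, size=2, replace=False)] = 1.0
    x = np.stack([values, markers], axis=-1)   # inputs: shape (n_examples, T, 2)
    y = (values * markers).sum(axis=1)         # targets: shape (n_examples,)
    return x, y

x, y = adding_problem(128, T=5000)   # T=5000 matches the sequence length cited above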