

Learning Long-Time Dependencies with RNNs w/ Konstantin Rusch - #484
May 17, 2021
In this discussion, Konstantin Rusch, a PhD student at ETH Zurich, dives into recurrent neural network (RNN) architectures designed to learn long-time dependencies. He shares insights from his neuroscience-inspired coRNN and UnICORNN papers and explains how these architectures compare to traditional models like LSTMs. Konstantin also discusses the challenge of keeping gradients stable, the techniques that preserve these RNNs' expressive power, and his plans for future work on memory efficiency and performance.
Vanishing/Exploding Gradient Problem
- RNNs struggle to learn long-time dependencies due to the vanishing/exploding gradient problem during backpropagation.
- This arises because backpropagation through time multiplies one Jacobian factor per step, and that long product shrinks or grows exponentially with sequence length (see the numerical sketch below).
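
To make that concrete, here is a rough numerical illustration (mine, not from the episode): for a linear recurrence the backpropagated gradient contains the matrix power W^T, whose norm decays or blows up exponentially depending on the spectral radius of W.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32  # hidden size (illustrative)

def product_norm(T, spectral_radius):
    """Norm of W^T for a random recurrent matrix W rescaled to the given spectral radius."""
    W = rng.normal(size=(n, n))
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    J = np.eye(n)
    for _ in range(T):
        J = W @ J            # one Jacobian factor per time step
    return np.linalg.norm(J)

for T in (10, 100, 500):
    print(f"T={T:4d}   radius 0.9 -> {product_norm(T, 0.9):.2e}   "
          f"radius 1.1 -> {product_norm(T, 1.1):.2e}")
# radius < 1: the product's norm collapses toward 0 (vanishing gradient);
# radius > 1: it grows without bound (exploding gradient).
```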
Neuroscience-Inspired RNNs
- Konstantin's approach to RNNs is inspired by neuroscience, specifically the oscillatory behavior of neurons.
- Oscillatory behavior offers stability, both in the system's state and its gradient, potentially enabling long-time dependency learning (a toy update rule is sketched below).
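
As a concrete picture of the idea, here is a minimal NumPy sketch of a coRNN-style update: a network of damped, driven oscillators advanced with a semi-implicit Euler step. The parameter names, default constants, and exact discretization are my illustrative assumptions, not necessarily the paper's precise scheme.

```python
import numpy as np

def cornn_step(y, z, u, W, Wz, V, b, dt=0.01, gamma=1.0, eps=1.0):
    """One step of an oscillatory recurrence (illustrative sketch).

    y is the hidden state ("position"), z its velocity, u the current input.
    The second-order ODE  y'' = tanh(W y + Wz y' + V u + b) - gamma*y - eps*y'
    is advanced by updating the velocity first (damping handled implicitly),
    then the state.
    """
    z = (z + dt * (np.tanh(W @ y + Wz @ z + V @ u + b) - gamma * y)) / (1.0 + dt * eps)
    y = y + dt * z
    return y, z
```

Because the forcing term is bounded and the oscillators are damped, iterating this step keeps the state from blowing up for reasonable dt, gamma, and eps, which is the kind of stability the snip refers to.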
coRNN Performance on Adding Problem
- Konstantin benchmarked coRNN on the adding problem, a synthetic long-time dependency task.
- coRNN converged directly even on sequences of length 5,000, outperforming LSTMs, which failed at shorter lengths (the task setup is sketched below).
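
For reference, the adding problem is typically set up like this: each input step is a (value, marker) pair, exactly two markers are 1, and the target is the sum of the two marked values, so the model must carry information across the whole sequence. The generator below is a sketch of one common variant, not the episode's exact benchmark code.

```python
import numpy as np

def adding_problem_batch(batch_size, seq_len, rng=None):
    """Generate a batch for the adding problem (one common variant)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    values = rng.uniform(0.0, 1.0, size=(batch_size, seq_len))
    markers = np.zeros((batch_size, seq_len))
    for i in range(batch_size):
        # mark one position in each half so the two relevant inputs are far apart
        markers[i, rng.integers(0, seq_len // 2)] = 1.0
        markers[i, rng.integers(seq_len // 2, seq_len)] = 1.0
    inputs = np.stack([values, markers], axis=-1)   # shape (batch, seq_len, 2)
    targets = (values * markers).sum(axis=1)        # shape (batch,)
    return inputs, targets

x, y = adding_problem_batch(4, 5000)  # the sequence length mentioned in the snip
```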