
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
The Unreasonable Effectiveness of the Forget Gate with Jos van der Westhuizen - TWiML Talk #240
Mar 18, 2019
Jos van der Westhuizen, a PhD student at Cambridge University, discusses his accidental journey from biomedical engineering to machine learning. He dives into the importance of the forget gate in LSTMs, revealing how it boosts computational efficiency. The conversation also covers JANET, his simplified LSTM architecture that keeps only the forget gate. Jos emphasizes selective learning and why managing what to forget is key to optimizing neural networks. Tune in to hear about the future of simpler, more efficient neural network designs!
AI Snips
Accidental ML Journey
- Jos van der Westhuizen's path to machine learning was accidental, starting in biomedical engineering and computational neuroscience.
- He initially aimed to create a wristwatch for comprehensive health diagnostics, but pivoted towards machine learning after encountering temporal modeling techniques.
LSTM Gates and Gradients
- Plain recurrent neural networks (RNNs) suffer from vanishing and exploding gradients because backpropagation through time pushes conflicting updates through the same recurrent weights at every timestep.
- LSTMs mitigate this with input, output, and forget gates, which manage what the cell state remembers and what it discards (see the equations below).
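
For reference, a standard formulation of the LSTM gates and cell-state update (notation varies slightly across papers):

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(additive cell update)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

Because the cell update is additive, the gradient flowing from c_t back to c_{t-1} is scaled by the forget gate rather than repeatedly multiplied by the same recurrent weight matrix, which is what makes plain RNN gradients vanish or explode.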
LSTM Gate Functions
- A typical LSTM uses three gates: input, output, and forget.
- The input gate controls how much new information enters the cell at each timestep, the output gate controls how much of the cell state is passed on as the hidden state, and the forget gate controls how much of the previous cell state is kept (see the sketch below).
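
As a rough illustration of how the gates interact, here is a minimal NumPy sketch of a single LSTM timestep, plus a JANET-style variant that keeps only the forget gate, as discussed in the episode. Parameter packing, shapes, and function names are illustrative assumptions, not the paper's code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One standard LSTM timestep.

    x: input of size m; h_prev, c_prev: previous hidden/cell state of size n.
    W (4n, m), U (4n, n), b (4n,) stack the parameters for the input (i),
    forget (f), output (o) gates and the candidate cell contents (g).
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b          # all pre-activations in one shot
    i = sigmoid(z[0 * n:1 * n])         # input gate: how much new info enters
    f = sigmoid(z[1 * n:2 * n])         # forget gate: how much old state is kept
    o = sigmoid(z[2 * n:3 * n])         # output gate: how much state is exposed
    g = np.tanh(z[3 * n:4 * n])         # candidate cell contents
    c = f * c_prev + i * g              # additive cell-state update
    h = o * np.tanh(c)                  # hidden state for the next timestep
    return h, c

def janet_step(x, h_prev, c_prev, W, U, b):
    """A JANET-style timestep keeping only the forget gate (sketch, not the paper's code).

    W (2n, m), U (2n, n), b (2n,) stack the forget-gate and candidate parameters.
    The write term is coupled to (1 - f) and the output gate is dropped.
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    f = sigmoid(z[0 * n:1 * n])         # the single remaining gate
    g = np.tanh(z[1 * n:2 * n])         # candidate cell contents
    c = f * c_prev + (1.0 - f) * g      # forgetting and writing share one gate
    return c, c                         # hidden state is just the cell state
```

In the JANET-style step, one gate decides both how much old state to keep and how much new information to admit, which is roughly what "managing what to forget" amounts to in practice.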

