
MLG 023 Deep NLP 2
Machine Learning Guide
LSTM, Long Short-Term Memory
Each LSTM cell in your hidden layer, or however many hidden layers you have, is going to latch onto a specific subsequence within a sequence. So let's say the sentence is: "After work, I'm going to go get my license at the DMV. Then I'm going to grab some drinks with friends, and then I have to work the rest of the night." There are kind of three separate things happening in this sentence. A sequence that long could indeed suffer from the vanishing gradient problem, which is exactly what the LSTM's gated cell state is designed to mitigate.
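To make the idea concrete, here is a minimal sketch of a single LSTM step in NumPy, assuming the standard gate formulation (input, forget, candidate, output). All names and dimensions here are illustrative, not from the episode: the forget gate is what lets a cell hold onto, or drop, the part of the sequence it has latched onto.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step. W stacks the four gate weight matrices,
    shape (4*H, D+H), applied to the concatenated [input, prev hidden]."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0:H])        # input gate: how much new info to write
    f = sigmoid(z[H:2*H])      # forget gate: how much old cell state to keep
    g = np.tanh(z[2*H:3*H])    # candidate cell update
    o = sigmoid(z[3*H:4*H])    # output gate: how much cell state to expose
    c = f * c_prev + i * g     # cell state carries the long-term memory
    h = o * np.tanh(c)         # hidden state passed to the next layer/step
    return h, c

# Toy run over a "sentence" of 3 random word vectors (D=5, hidden size H=4).
rng = np.random.default_rng(0)
D, H = 5, 4
W = rng.standard_normal((4 * H, D + H)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.standard_normal((3, D)):
    h, c = lstm_step(x, h, c, W, b)
print(h.shape)
```

The additive update `c = f * c_prev + i * g` is the key: because the cell state is carried forward by multiplication with a gate rather than repeatedly squashed through a nonlinearity, gradients survive over long subsequences far better than in a plain RNN.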