The Weirdest Furthest Out in a Little While
The path the feedback signal travels to reach each part of the model is shorter than in the very deep recurrent setting. That means you learn better. It's not really like that, but it's fun to think about. If you have a bunch of tape measures of equal length, they work full down. I interrupted you to ask a dumb question about what tokens we're going over. The charitable interpretation, which I'm going to give myself, is that that was the most fascinating thing happening over the past few years.