3min chapter

Learning to Ponder: Memory in Deep Neural Networks with Andrea Banino - #528

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

CHAPTER

Transformers and Reinforcement Learning

In RL, we keep using LSTMs essentially for doing most of the tasks. But we know that they suffer from what is called a recency bias. So one option would be to use a transformer, because it can attend to long-term context. However, the rewards are sparser and the gradient has been shown to be noisier, so it's difficult to train so many weights. What we did in that work was basically to generalize the BERT training, which, you know, is done on tokens, and those are like categorical numbers. We generalized the BERT masking to real-valued inputs, so basically to features. We send the features from
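To make the idea of BERT-style masking on real-valued inputs concrete, here is a minimal sketch in plain Python. It is not the method from the episode: the excerpt cuts off before the training objective is described. The function names (`mask_features`, `masked_mse`), the choice to replace masked steps with zero vectors, and the mean-squared-error objective are therefore all illustrative assumptions.

```python
import random


def mask_features(sequence, mask_prob=0.15, seed=0):
    """Mask random time steps of a sequence of real-valued feature vectors.

    Assumption: a masked step is replaced by an all-zero vector. BERT itself
    replaces categorical tokens with a special [MASK] token instead.
    Returns the masked sequence and the list of masked time steps.
    """
    rng = random.Random(seed)
    masked, positions = [], []
    for t, features in enumerate(sequence):
        if rng.random() < mask_prob:
            masked.append([0.0] * len(features))
            positions.append(t)
        else:
            masked.append(list(features))
    return masked, positions


def masked_mse(predictions, targets, positions):
    """Mean squared error over the masked time steps only (assumed objective)."""
    if not positions:
        return 0.0
    total, count = 0.0, 0
    for t in positions:
        for p, y in zip(predictions[t], targets[t]):
            total += (p - y) ** 2
            count += 1
    return total / count


# Toy input: 8 time steps of 4-dimensional features (hypothetical data).
data_rng = random.Random(1)
sequence = [[data_rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]

masked, positions = mask_features(sequence, mask_prob=0.5, seed=0)

# Stand-in for a model: it simply echoes the masked input, so the loss
# measures how much information the masking removed.
loss = masked_mse(masked, sequence, positions)
```

In a real setup the echo step would be replaced by a transformer that sees the masked sequence and predicts the original features at the masked positions. The loss is restricted to those positions so the model cannot score well by copying its input.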
