2min snip

Your AI Friends Have Awoken, With Noam Shazeer

No Priors: Artificial Intelligence | Technology | Startups

NOTE

Explanation of the last letter in ChatGPT: Transformer-Based Model

The difference between an RNN and a transformer-based (attention-based) model is that RNNs process words sequentially, predicting each next word from the previous hidden state, while transformers process the entire sequence at once and exploit parallelism. Transformers use attention, which acts as a key-value associative memory and lets the model look up information in a fuzzy, differentiable way. This attention mechanism applies not only to problems with two sequences, like machine translation, but also to looking back at the earlier part of the sequence being generated. Transformers map efficiently onto GPUs and TPUs, much as deep learning more broadly has thrived on hardware advances. Their ability to handle sequences with varying word orderings without information loss makes them an elegant solution.
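As a minimal sketch of the "fuzzy, differentiable key-value lookup" described above, here is scaled dot-product attention in NumPy. This is illustrative only, not code from the episode; the function name `attention`, the `causal` flag, and the toy shapes are assumptions for the example:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention: a fuzzy, differentiable key-value lookup.

    queries: (T_q, d), keys: (T_k, d), values: (T_k, d_v)
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)  # similarity of every query to every key
    if causal:
        # Mask future positions so position i can only look back at positions <= i,
        # matching how a transformer attends to the past of the sequence it is producing.
        mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    weights = softmax(scores, axis=-1)  # soft ("fuzzy") match over all keys
    return weights @ values             # weighted mix of the stored values

# Self-attention over a toy sequence: every position attends to the whole
# sequence in a single matrix multiply, which is why this parallelizes well
# on GPUs/TPUs, unlike an RNN's step-by-step recurrence.
T, d = 5, 8
x = np.random.randn(T, d)
out = attention(x, x, x, causal=True)
print(out.shape)  # (5, 8)
```

With two different sequences (e.g., source and target in machine translation), the same function would be called with queries from one sequence and keys/values from the other; with `causal=True` and a single sequence, it looks back at the past of the sequence being produced.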
