2min snip

Large language models can do jaw-dropping things. But nobody knows exactly why.

MIT Technology Review Narrated

NOTE

Rethink Complexity in Deep Learning

Progress in understanding deep learning continues, but many questions remain open. Recent research suggests that grokking and double descent may be connected phenomena, which points to the need for explanations that cover both. By contrast, some researchers challenge the validity of double descent itself, arguing that it may be an artifact of flawed measures of model complexity: the raw parameter count does not accurately reflect a model's effective complexity, which can vary with how the parameters are used and how they interact during training. Rethinking complexity metrics could yield a better grasp of large-model behavior, and it suggests that existing mathematical frameworks may suffice to explain these phenomena, even though model dynamics at scale remain poorly understood.
