
Cursor.so: The AI-first Code Editor — with Aman Sanger of Anysphere

Latent Space: The AI Engineer Podcast

NOTE

Scaling Up Attention in Large Models

The issue with long context lengths is that cost scales linearly with the number of tokens: in large models the attention computation becomes negligible compared to the feed-forward part, so you are effectively paying a fixed price per token, and paying for very large numbers of tokens is expensive. There might be a better approach to long context in the future; state space models, for instance, offer parallelization benefits and efficient training.
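
To make the scaling claim concrete, here is a rough back-of-the-envelope sketch (my own, not from the episode) comparing the per-token FLOPs of the context-dependent attention term against the feed-forward block in a standard decoder-only transformer. The width d_model = 12288 and the 4x FFN expansion factor are illustrative assumptions.

```python
def attention_flops_per_token(n_ctx: int, d_model: int) -> int:
    # QK^T scores plus the attention-weighted sum over values:
    # roughly 2 * 2 * n_ctx * d_model FLOPs per token per layer.
    return 4 * n_ctx * d_model

def ffn_flops_per_token(d_model: int, expansion: int = 4) -> int:
    # Two dense matmuls (d -> expansion*d -> d):
    # roughly 2 * 2 * expansion * d_model^2 FLOPs per token per layer.
    return 4 * expansion * d_model ** 2

d_model = 12_288  # GPT-3-scale width, chosen only for illustration
for n_ctx in (2_048, 8_192, 32_768, 131_072):
    attn = attention_flops_per_token(n_ctx, d_model)
    ffn = ffn_flops_per_token(d_model)
    print(f"n_ctx={n_ctx:>7}: attention/FFN FLOP ratio ~ {attn / ffn:.2f}")
```

Under these assumptions the ratio is n_ctx / (4 * d_model), so attention stays a small fraction of the feed-forward cost until the context approaches ~49k tokens at this width; total cost therefore grows roughly linearly with the number of tokens you pay for.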

