
Interconnects

Interviewing Tri Dao and Michael Poli of Together AI on the future of LLM architectures

Dec 21, 2023
Tri Dao, an incoming professor at Princeton and Chief Scientist at Together AI, joins Michael Poli, a Stanford PhD graduate and research scientist at Together AI. They discuss why traditional attention mechanisms may not scale effectively to long sequences and introduce alternative models such as StripedHyena and Mamba. They also cover hardware optimization for these architectures and offer predictions for AI in 2024 that challenge the dominance of current transformer models.
35:47

Podcast summary created with Snipd AI

Quick takeaways

  • The podcast highlights a shift towards non-attention architectures like state space models, which offer greater efficiency and better handling of long-range dependencies.
  • Experts anticipate a future where hybrid architectures integrate features of both attention-based and non-attention-based systems for optimized AI performance.

Deep dives

The Rise of Non-Attention Architectures

The discussion highlights growing interest in non-attention architectures for AI, at a time when media coverage of AI is expanding. These architectures are valued for their efficiency and their ability to handle long-range dependencies, in contrast to traditional transformer models, which rely heavily on attention mechanisms. The researchers note that although transformers have shown remarkable scalability and effectiveness, the cost of attention grows quadratically with sequence length, which may become a bottleneck on longer inputs. The push for alternatives marks a significant shift in model design toward newer techniques that may process sequences efficiently without that cost.
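To make the quadratic-versus-linear contrast concrete, here is a minimal sketch of a simplified cost model. It is not taken from the episode and is not how Mamba or StripedHyena are implemented; the function names, the head dimension of 64, and the state size of 16 are assumptions chosen for illustration. It only counts multiply-adds for forming the attention score matrix versus updating a fixed-size recurrent state once per token.

```python
def attention_score_ops(seq_len: int, dim: int) -> int:
    """Multiply-adds to form the seq_len x seq_len score matrix Q @ K^T.

    Every token is compared with every other token, so the count grows
    with the square of the sequence length.
    """
    return seq_len * seq_len * dim


def recurrent_scan_ops(seq_len: int, dim: int, state_size: int) -> int:
    """Multiply-adds for one fixed-size state update per token.

    Hypothetical simplified model: each token touches dim * state_size
    state entries once, so the count grows linearly with sequence length.
    """
    return seq_len * dim * state_size


if __name__ == "__main__":
    dim, state_size = 64, 16  # illustrative values, not from the source
    for seq_len in (1_000, 2_000, 4_000):
        attn = attention_score_ops(seq_len, dim)
        scan = recurrent_scan_ops(seq_len, dim, state_size)
        print(f"seq_len={seq_len}: attention={attn}, scan={scan}")
```

Under this model, doubling the sequence length multiplies the attention count by four and the recurrent count by two, which is the gap the speakers point to for long inputs.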
