The Attention Mechanism with Andrew Mayne

Nvidia Gets Groq (Not That One)

Jan 7, 2026
Nvidia's massive $20 billion acquisition of Groq sets the stage for a deep dive into AI chip technology. Learn how Groq's specialized chips outperform standard GPUs in inference tasks. The hosts explore futuristic predictions for 2026, emphasizing a shift towards distributed inference data centers. They also debate the viability of OpenAI's consumer hardware, questioning whether Jony Ive's involvement can deliver results by year-end. Expect insights on the booming demand for AI compute and the strategic investments shaping the industry.
INSIGHT

Inference Chips Are Becoming Specialized

  • Groq built language-focused inference chips (LPUs) that run reasoning models far faster than typical GPUs.
  • NVIDIA's $20B non-exclusive deal pulls that speed into a company that can scale production and deployment quickly.
INSIGHT

Two Paths To Post‑TPU Speed

  • Wafer-scale designs (Cerebras) and modular LPUs (Groq) took different hardware paths to faster inference.
  • Both approaches point to a post-TPU shift in which architecture choices optimize for latency, scale, and rack compatibility.
ANECDOTE

Personal Use Of Groq Cloud For Fast Speech‑to‑Text

  • Andrew Mayne uses Groq Cloud for extremely fast speech‑to‑text when he doesn't run local models.
  • He cites the gpt-oss-120b model and Groq's speed as practical reasons to choose their cloud service (see the sketch below).
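A minimal sketch of what that Groq Cloud workflow might look like, using the OpenAI-compatible Python client. The base URL, the whisper-large-v3 transcription model, the openai/gpt-oss-120b identifier, the API key placeholder, and the audio filename are assumptions drawn from typical Groq Cloud usage, not details from the episode.

```python
# Sketch: fast speech-to-text plus a follow-up chat call on Groq Cloud,
# assuming its OpenAI-compatible endpoint. Model names, the API key
# placeholder, and the file path are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",  # Groq's OpenAI-compatible API (assumed)
    api_key="YOUR_GROQ_API_KEY",                # replace with a real key
)

# Transcribe a local audio file with a hosted Whisper-family model.
with open("interview.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-large-v3",  # assumed transcription model name
        file=audio_file,
    )
print(transcript.text)

# Optionally pass the transcript to the open-weight model mentioned in the episode.
chat = client.chat.completions.create(
    model="openai/gpt-oss-120b",  # assumed Groq model identifier
    messages=[
        {"role": "user", "content": "Summarize this transcript in two sentences:\n" + transcript.text},
    ],
)
print(chat.choices[0].message.content)
```

Pointing the standard OpenAI client at Groq's endpoint keeps the code portable between providers; only the base URL, key, and model names change.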