
The Attention Mechanism with Andrew Mayne Nvidia Gets Groq (Not That One)
Jan 7, 2026
Nvidia's massive $20 billion deal with Groq sets the stage for a deep dive into AI chip technology. Learn how Groq's specialized chips outperform standard GPUs in inference tasks. The hosts explore futuristic predictions for 2026, emphasizing a shift towards distributed inference data centers. They also debate the viability of OpenAI's consumer hardware, questioning whether Jony Ive's involvement can deliver results by year-end. Expect insights on the booming demand for AI compute and the strategic investments shaping the industry.
AI Snips
Inference Chips Are Becoming Specialized
- Groq built language-focused inference chips (LPUs) that run reasoning models far faster than typical GPUs.
- Nvidia's $20B non-exclusive deal pulls that speed into a company that can scale production and deployment quickly.
Two Paths To Post‑TPU Speed
- Wafer-scale designs (Cerebras) and modular LPUs (Groq) took different hardware paths to speed inference.
- Both approaches highlight a post-TPU shift where architecture choices optimize latency, scale, and rack compatibility.
Personal Use Of Groq Cloud For Fast Speech‑to‑Text
- Andrew Mayne uses Groq Cloud for extremely fast speech‑to‑text when he doesn't run local models.
- He cites OpenAI's gpt-oss-120b model and Groq's speed as practical reasons to choose their cloud service (a minimal API sketch follows below).