

#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Feb 3, 2025
Dylan Patel, founder of SemiAnalysis, and Nathan Lambert, research scientist at the Allen Institute for AI, dive into the intricate world of AI and semiconductors. They discuss the implications of China's DeepSeek AI models, the evolving geopolitical landscape, and how export controls shape technology competition. The conversation offers insights into AI model architectures, including mixture-of-experts models, and the challenges of training and optimization. They also consider the role of transparency and ethics in AI development and how these will shape the future of this transformative technology.
Episode notes
DeepSeek Models Overview
- DeepSeek's open-weights models V3 and R1 are an instruction model and a reasoning model, respectively.
- Trained on large text corpora, they offer performance comparable to OpenAI's models at lower cost, and their weights are openly released.
DeepSeek V3 vs. R1
- DeepSeek V3 Base is the pre-trained model; different post-training recipes turn it into the instruction-tuned V3 and the reasoning model R1.
- Unlike OpenAI's reasoning models, R1 exposes its full reasoning process to users, which makes it stand out (see the sketch after this list).
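The visible reasoning is easy to see in practice. Below is a minimal sketch of querying an R1-style model through DeepSeek's OpenAI-compatible API; the base URL, the "deepseek-reasoner" model name, and the "reasoning_content" field are assumptions recalled from DeepSeek's public documentation, so verify them against the current docs before relying on this.

```python
# Minimal sketch: ask an R1-style reasoning model a question and print both its
# visible reasoning trace and its final answer.
# Assumptions (verify against DeepSeek's current docs): the OpenAI-compatible
# endpoint at https://api.deepseek.com, the "deepseek-reasoner" model name, and
# the non-standard "reasoning_content" field on the returned message.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # placeholder credential
    base_url="https://api.deepseek.com",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "What does it mean for an idea to be shared?"}],
)

message = response.choices[0].message
# The reasoning trace is returned alongside the answer rather than hidden.
print("Reasoning trace:\n", getattr(message, "reasoning_content", "<not provided>"))
print("\nFinal answer:\n", message.content)
```

In the open-weights releases the behavior is similar: R1 checkpoints emit their reasoning inline, delimited by <think> tags, before the final answer, rather than hiding it from the user.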
DeepSeek R1's Philosophical Insight
- Lex Fridman tested DeepSeek R1 with a philosophical question.
- R1's reasoning process was visible, culminating in a profound insight about humans' shared "hallucinations."