
Latent Space: The AI Engineer Podcast
World Models & General Intuition: Khosla's largest bet since LLMs & OpenAI
Dec 6, 2025
Pim DeWitte, Founder and CEO of General Intuition, shares his view of the future of AI. He discusses turning down a $500M offer from OpenAI to focus on building world models from action-labeled game clips. He describes how training on game highlights gives AI agents superhuman abilities and lets them predict actions in real time. He also explains why episodic memory matters for learning, and sets out his goal for spatial-temporal models to transform AI interactions by 2030.
AI Snips
Highlight Clips As Episodic Memory
- Medal accumulated 3.8 billion player highlight clips that capture peak human gameplay and actions.
- Pim DeWitte sees that dataset as a foundational "episodic memory", ideal for training spatial-temporal models.
Vision-Only Agent Demo
- Pim demoed a vision-only agent that sees raw frames and outputs actions in real time against humans.
- The model runs live and plays like a human, sometimes at a superhuman level, despite having no RL fine-tuning.
Frames-To-Actions Enables Video Transfer
- GI labels internet video by predicting actions from frames, enabling transfer from games to real-world clips.
- Once models can map frames to action tokens, any video becomes free training data.




