

Grok4's Leap And Meta's Strategic Moves | The Brainstorm EP 94
48 snips Jul 16, 2025
In this discussion, Jozef Soja, a Research Analyst at ARK Invest specializing in AI, sheds light on Grok4's benchmark results and its real-world applications. The conversation covers the challenges Grok4 still faces in practice despite its impressive theoretical performance. It also turns to Meta's aggressive investments in AI and its strategic deals, including its stake in Scale AI, and to where AI technology is heading as the field evolves rapidly.
AI Snips
Grok4 Sets New AI Benchmarks
- Grok4 leads AI benchmarks with advanced reasoning and tool use, scoring notably high on GPQA and Humanity's Last Exam.
- Early and intensive investment in reinforcement learning and tool integration is key to its performance gains.
Grok4 Excels in Simulation
- Grok4 outperforms humans and other AI on the simulated vending machine test, showing advanced long-term strategy and common sense.
- Despite this simulated success, real-world applications remain challenging because of unexpected common-sense errors.
Benchmark Leaders Differ From Usage
- The AI models developers use most differ from the benchmark leaders, largely because of cost and tooling specialization.
- Cheaper models like Google's Gemini see more usage despite not always leading in benchmark performance.