John Yang
Researcher and benchmark author focused on code evaluation and long-horizon AI coding agents; creator of SWE-bench and CodeClash and a Stanford PhD student working on human–AI collaboration and code evals.
Best podcasts with John Yang
Ranked by the Snipd community
318 snips • Dec 31, 2025 • 18 min
[State of Code Evals] After SWE-bench, Code Clash & SOTA Coding Benchmarks recap — John Yang
Join John Yang, a Stanford PhD student and the mind behind SWE-bench and CodeClash, as he shares insights from the cutting-edge world of AI coding benchmarks. Discover how SWE-bench went from zero to industry standard in mere months, the limitations of traditional unit tests, and the innovative long-horizon tournaments of CodeClash. Yang dives into the debate around Tau-bench's 'impossible tasks' and explores the balance between autonomous agents and interactive workflows. Get ready for a glimpse into the future of human-AI collaboration!