AI #135: OpenAI Shows Us The Money

Don't Worry About the Vase Podcast

SWE Bench Pro Raises Coding Agent Bar

Introduces SWE Bench Pro and explains how it tests realistic, enterprise-grade, multi-file coding tasks, revealing that current coding agents perform poorly on them.
