When Models Perform Better on Verifiable vs. Open-Ended Tasks

Demetri asks why reasoning models excel on math but can struggle in open-ended conversation; Nathan contrasts verifiable benchmarks with subjective domains.

Play episode from 36:44

Episode notes

In Episode 448 of Hidden Forces, Demetri Kofinas speaks with Nathan Benaich, founder and general partner of Air Street Capital and the creator of the annual State of AI Report, an open-access compendium that tracks advances across AI research, industry, policy, and geopolitics.

Nathan Benaich and Demetri spend the first hour of their conversation exploring some of the most important AI breakthroughs of the year. They unpack the DeepSeek moment, dig into some of the advancements made by the latest reasoning models, and discuss why there appears to be a regression in capabilities across certain domains in artificial intelligence at the same time as we are seeing marked improvements in reasoning-heavy use cases like coding and scientific research.

The second hour turns to the commercial implications and geopolitical dynamics of the AI arms race, including China's strategy to become the leader in open-weight models and tooling. They look at which industries, sectors, and professions may be most ripe for disruption, where the investment opportunities are, whether we're in a bubble comparable to the 1990s Internet boom, and how export controls, energy constraints, and regulatory red tape could play an outsized role in shaping the trajectory of the current arms race.

Lastly, Kofinas and Benaich examine where along the AI stack most of the value is likely to accrue—from the underlying picks and shovels, through the foundation models, to the apps that ride on top of them—and what all this means for labor markets, education, and the cadence of scientific discovery.

Subscribe to our premium content—including our premium feed, episode transcripts, and Intelligence Reports—by visiting HiddenForces.io/subscribe.

If you'd like to join the conversation and become a member of the Hidden Forces Genius community—with benefits like Q&A calls with guests, exclusive research and analysis, in-person events, and dinners—you can also sign up on our subscriber page at HiddenForces.io/subscribe.

If you enjoyed today's episode of Hidden Forces, please support the show by:

Subscribing on Apple Podcasts, YouTube, Spotify, Stitcher, SoundCloud, CastBox, or via our RSS Feed
Writing us a review on Apple Podcasts & Spotify
Joining our mailing list at https://hiddenforces.io/newsletter/

Producer & Host: Demetri Kofinas
Editor & Engineer: Stylianos Nicolaou

Subscribe and support the podcast at https://hiddenforces.io. Join the conversation on Facebook, Instagram, and Twitter at @hiddenforcespod. Follow Demetri on Twitter at @Kofinas.

Episode Recorded on 10/29/2025
