
How Intelligent Is AI, Really?
Y Combinator Startup Podcast
False positives: benchmarking pitfalls
Greg warns against RL-environment overfitting and vanity metrics, urging work on true generalization rather than benchmark hacking.


