
903: LLM Benchmarks Are Lying to You (And What to Do Instead), with Sinan Ozdemir
Super Data Science: ML & AI Podcast with Jon Krohn
00:00
Navigating AI Advisory and Benchmarking Challenges
This chapter covers the speaker's role advising Tola Capital on AI in finance, where the emphasis is on educating clients rather than offering purely transactional solutions. It also examines the limitations and complexities of current AI benchmarks and questions how relevant they are to evaluating large language models. The conversation argues that both the people who create benchmarks and the people who rely on them need transparency and a clear understanding of what is being measured in order to foster trust and accountability in AI development.