
On GPT-4.5
Don't Worry About the Vase Podcast
00:00
Evaluating GPT-4.5: Benchmarks and Beyond
This chapter scrutinizes the effectiveness of current benchmarks in assessing AI model performance, specifically focusing on GPT-4.5. The speakers discuss the model's advancements and the challenges it poses to traditional evaluation methods, while also exploring user experiences and perceptions. Ultimately, the conversation reflects on the mixed reception of GPT-4.5, highlighting its strengths and perceived shortcomings in comparison to earlier versions.


