
Nous Hermes 3 and exploiting underspecified evaluations
Interconnects
Evaluating Nous Hermes 3: Frontier Models and Evaluation Integrity
This chapter explores the launch of Nous Hermes 3 and its classification as a frontier model, comparing it to Llama 3.1. The discussion highlights the need for transparent evaluation metrics and the broader implications for future technology policy.