Nous Hermes 3 and exploiting underspecified evaluations
Aug 16, 2024
The discussion kicks off with the launch of a new model and the question of what defines a 'frontier model.' Notable comparisons are drawn with Llama 3.1, and the importance of transparent evaluation metrics emerges. The conversation elaborates on lessons learned from the training process of Hermes 3 and highlights the broader implications for technology policy, emphasizing the need for integrity in AI evaluations.
The uncertainty surrounding the criteria for identifying Frontier Models has sparked ongoing debates about transparency and credibility in the tech ecosystem.
Discrepancies between Hermes 3's reported performance and actual results underscore the critical need for stringent evaluation standards and clear documentation in model assessments.
Deep dives
Defining Frontier Models
The criteria for identifying a model as a Frontier Model are currently unclear, leading to debates within the tech ecosystem. Traditionally, success in LMSYS's Chatbot Arena has served as a benchmark, but trust in this measure is waning. With the introduction of an open-weight frontier model, Llama 3.1 405B, there is speculation about whether this will lower the barrier for others to join the Frontier Model Club. As many organizations strive to expand the capabilities of modern language models, the need for a solid framework to evaluate these models becomes increasingly critical.
Evaluation Challenges and Insights
The recent release of Nous Research's Hermes 3 models raises questions about the transparency and comprehensiveness of model evaluations. Users have observed discrepancies between reported scores and actual performance, particularly when comparing Hermes 3 to existing models like Llama 3.1. The lack of detailed evaluation metrics in the Hermes report calls its status as a Frontier Model into question, highlighting the necessity for clear documentation behind such claims. Despite its design for broad usability and engagement, without stringent evaluation standards the model's credibility in the competitive landscape remains uncertain.
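One way such discrepancies arise is that a benchmark's answer-extraction rule is often left unspecified, so two labs can score the very same model outputs differently. The sketch below is a hypothetical illustration (the outputs and extraction rules are invented, not taken from the Hermes report): a strict exact-match scorer and a lenient substring scorer produce different "accuracy" numbers for identical responses.

```python
# Hypothetical multiple-choice outputs paired with the gold answer letter.
# These are invented examples, not real benchmark data.
outputs = [
    ("The answer is (B).", "B"),
    ("B", "B"),
    ("I think B, but A is plausible.", "B"),
    ("(C)", "B"),
]

def strict(pred: str, gold: str) -> bool:
    # Strict rule: the response must be exactly the gold letter.
    return pred.strip() == gold

def lenient(pred: str, gold: str) -> bool:
    # Lenient rule: the gold letter appearing anywhere counts as correct.
    return gold in pred

strict_acc = sum(strict(p, g) for p, g in outputs) / len(outputs)
lenient_acc = sum(lenient(p, g) for p, g in outputs) / len(outputs)

# Same model outputs, two different reported scores.
print(f"strict: {strict_acc:.2f}, lenient: {lenient_acc:.2f}")
# → strict: 0.25, lenient: 0.75
```

This is why reports that omit their evaluation harness settings are hard to compare: without the extraction rule, prompt format, and shot count pinned down, a headline number is underdetermined.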
1. Evaluating Nous Hermes 3: Frontier Models and Evaluation Integrity