“METR’s Evaluation of GPT-5” by GradientDissenter

Aug 8, 2025

Gradient Dissenter, who works at METR and played a key role in evaluating GPT-5, discusses the thorough safety analysis conducted on the AI model prior to its launch. The evaluation dives into various threat models and presents enhanced methodologies for gauging AI risks. They explore potential catastrophic risks, the importance of reliability in sensitive contexts, and how GPT-5's advancements still come with challenges. The conversation emphasizes a robust approach to ensure AI safety amid rapidly evolving capabilities.

AI Snips

ANECDOTE

Access To Reasoning Traces

Gradient Dissenter reports METR received GPT-5 reasoning traces and background info under NDA.
They used these to strengthen their risk assessment.

INSIGHT

METR's Overall Risk Conclusion

METR judged GPT-5 unlikely to pose catastrophic risk under the three threat models it considered.
They based this on time-horizon estimates and OpenAI assurances.

ADVICE

Clear Thresholds For Concern

METR lists concrete capability thresholds that should trigger deep review.
Use these thresholds to prompt targeted evaluations before deployment.
