
Introduction to AI Control
BlueDot Narrated
Untrusted Models Monitoring Each Other
Sarah discusses using multiple instances of an untrusted model to monitor one another, and ways to mitigate the risk that they collude.
Transcript
