LessWrong (Curated & Popular) cover image

"More information about the dangerous capability evaluations we did with GPT-4 and Claude." by Beth Barnes

LessWrong (Curated & Popular)

00:00

Replicating an AI Model to Delegate Captures

We spoofed in quotes the results of completing illegal and or dangerous tasks. The model also interacted with unsuspecting humans, for example when delegating a task to a crowd worker. In these cases, an ARC employee supervising the model's actions and intervening if necessary. We briefly explored the ability to delegate captures to human workers with a version of GPT-4. This version was not identical to the release model and we did not have access to image capabilities.

Play episode from 06:50
Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app