AI Safety via Debate

AI Safety Fundamentals: Alignment

Introduction

This is an audio version of AI Safety via Debate by Geoffrey Irving, Paul Christiano, and Dario Amodei. It was published on 22 October 2018. This excerpt is included as one of the core readings of the AGI Safety Fundamentals course. We propose training agents via self-play on a zero-sum debate game. Given a question or proposed action, two agents take turns making short statements up to a limit, then a human judges which of the agents gave the most true, useful information.
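
The debate game described above can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the names `run_debate`, `Agent`, and `Judge` are hypothetical, the agents and the judge are stand-in callables, and the +1/-1 reward scheme is an assumption chosen simply to make the game zero-sum.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration only.
Statement = Tuple[int, str]                      # (agent index, statement text)
Agent = Callable[[str, List[Statement]], str]    # sees question + transcript so far
Judge = Callable[[str, List[Statement]], int]    # returns index of the winning agent


def run_debate(
    question: str,
    agents: Sequence[Agent],
    judge: Judge,
    max_statements: int,
) -> Tuple[List[Statement], int, Tuple[int, int]]:
    """Run one debate: two agents alternate statements, then a judge picks a winner."""
    if len(agents) != 2:
        raise ValueError("the debate game is played by exactly two agents")

    transcript: List[Statement] = []
    for turn in range(max_statements):
        speaker = turn % 2                       # agents take turns: 0, 1, 0, 1, ...
        statement = agents[speaker](question, transcript)
        transcript.append((speaker, statement))

    winner = judge(question, transcript)         # stand-in for the human judge
    if winner not in (0, 1):
        raise ValueError("judge must return 0 or 1")

    # Assumed zero-sum payoff: winner gets +1, loser gets -1.
    rewards = (1, -1) if winner == 0 else (-1, 1)
    return transcript, winner, rewards
```

In self-play training, both entries of `agents` would be copies of the same model, and the rewards returned here would feed the training signal.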
