Constitutional AI: RLHF On Steroids

Astral Codex Ten Podcast

Introduction

AIs like GPT-4 go through several different types of training. One of these, reinforcement learning from human feedback (RLHF), usually has human crowd workers rate thousands of AI responses as good or bad. Anthropic has announced a surprising update to this process: what if the AI gives feedback to itself? It's called Constitutional AI because the prompt that guides this self-feedback can serve as a sort of constitution for the AI. These constitutional approaches seem to reach higher levels of harmlessness for a given level of helpfulness.
