
Constitutional AI: RLHF On Steroids
Astral Codex Ten Podcast
Introduction
AIs like GPT-4 go through several different types of training. The usual method is to have human crowd workers rate thousands of AI responses as good or bad. Anthropic has announced a surprising update to this process: what if the AI gives feedback to itself? The approach is called Constitutional AI because the prompt that guides this self-feedback can serve as a sort of constitution for the AI. These constitutional approaches seem to reach higher levels of harmlessness for a given level of helpfulness.