Using GPT to Accelerate Alignment Research

OpenAI's focus with augmentations is very much on fixing "bugs" in how GPT behaves. As a result, instead of being able to interact with the probabilistic world model directly, we are forced to interact with the black-box agentic process. Everything becomes filtered through the preferences and biases of that process. Notice that these are all things that GPT is very poorly suited for, but that humans find quite easy when they want to.
