RLHF 201 - with Nathan Lambert of AI2 and Interconnects

Latent Space: The AI Engineer Podcast

Guided Sampling and Implicit Values in AI Model Training

An AI model is given a principle sampled from a constitution and uses it to pick the preferred response, which produces a new preference dataset. The process is less explicit than one might expect: individual principles are not enforced directly, and they are incorporated only through averages over a large number of samples. The approach is similar to the RLHF setup with instruction tuning, where an AI model provides critiques based on sampled constitutional values. The process seems more tractable, but in practice it may deviate from the approach stated in the paper.
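The sampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the paper: the `CONSTITUTION` entries, the `toy_judge` heuristic, and the `build_preference_dataset` helper are all hypothetical names invented here, and a real setup would call a language model where the toy judge stands in.

```python
import random

# Hypothetical principles; a real constitution is longer and worded differently.
CONSTITUTION = [
    "Choose the response that is more helpful.",
    "Choose the response that is less harmful.",
    "Choose the response that is more honest.",
]


def toy_judge(principle, prompt, response_a, response_b):
    """Stand-in for the AI feedback model (assumption: a real system
    would query a language model with the principle and both responses).

    Returns 0 if response_a is preferred, 1 if response_b is preferred.
    """
    if "harmful" in principle:
        # Placeholder heuristic: treat the shorter response as safer.
        return 0 if len(response_a) <= len(response_b) else 1
    # Placeholder heuristic: treat the longer response as more useful.
    return 0 if len(response_a) >= len(response_b) else 1


def build_preference_dataset(pairs, judge, rng):
    """Label each (prompt, response_a, response_b) triple using one
    randomly sampled principle and return chosen/rejected records."""
    dataset = []
    for prompt, response_a, response_b in pairs:
        # One principle is sampled per comparison.
        principle = rng.choice(CONSTITUTION)
        preferred = judge(principle, prompt, response_a, response_b)
        if preferred == 0:
            chosen, rejected = response_a, response_b
        else:
            chosen, rejected = response_b, response_a
        # The sampled principle is not stored with the record, so its
        # influence shows up only on average across many comparisons.
        dataset.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return dataset
```

Because each record keeps only the chosen and rejected responses, the resulting dataset carries the principles implicitly, which matches the point that the constitution is incorporated through averages and scale rather than explicitly.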

Play episode from 59:24