RLHF 201 - with Nathan Lambert of AI2 and Interconnects

Latent Space: The AI Engineer Podcast

NOTE

Different domains of AI development are expected to call for different feedback types, such as written feedback, multiple labeled scores, or pairwise preferences. Chain-of-thought reasoning and process reward models suit math but may not be ideal for poetry. As the tooling improves, it becomes more domain-specific. Constitutional AI generates preference data by having a second model evaluate the outputs of a first model against principles drawn from sources such as the UN Declaration of Human Rights and Apple's terms of service.
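The preference-generation step described in this note can be sketched roughly as follows. This is a minimal illustration, not the method used by any particular lab: the principle strings are paraphrased placeholders, and `make_toy_generator`, `toy_judge`, and `build_preference_pair` are hypothetical stand-ins for real model calls.

```python
import random
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical principles, paraphrased for illustration only.
PRINCIPLES: List[str] = [
    "Choose the response that most supports freedom, equality, and dignity.",
    "Choose the response that is least objectionable or offensive.",
]


@dataclass
class PreferencePair:
    prompt: str
    chosen: str
    rejected: str
    principle: str


def build_preference_pair(
    prompt: str,
    generate: Callable[[str], str],
    judge: Callable[[str, str, str, str], int],
    rng: random.Random,
) -> PreferencePair:
    """Sample two responses, then let a second model pick one under a principle."""
    a = generate(prompt)
    b = generate(prompt)
    principle = rng.choice(PRINCIPLES)
    pick = judge(prompt, a, b, principle)  # 0 means `a` is preferred, 1 means `b`
    chosen, rejected = (a, b) if pick == 0 else (b, a)
    return PreferencePair(prompt, chosen, rejected, principle)


def make_toy_generator(responses: List[str]) -> Callable[[str], str]:
    """Stand-in for the first model: returns canned responses in order."""
    it = iter(responses)

    def generate(prompt: str) -> str:
        return next(it)

    return generate


def toy_judge(prompt: str, a: str, b: str, principle: str) -> int:
    """Stand-in for the second model: prefers `a` unless it contains 'insult'."""
    return 0 if "insult" not in a else 1


gen = make_toy_generator(["Here is an insult.", "Here is a polite reply."])
pair = build_preference_pair(
    "Reply to a rude email.", gen, toy_judge, random.Random(0)
)
print(pair.chosen)
```

In a real pipeline both stand-ins would be language-model calls, and the resulting pairs would feed a reward model or a direct preference-optimization step.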

Play episode from 57:13