Why reward models are still key to understanding alignment

Feb 14, 2024

07:44

Episode notes

In an era dominated by direct preference optimization and LLM-as-a-judge, why do we still need a model to output only a scalar reward?
This is AI-generated audio made with Python and 11Labs. Music generated by Meta's MusicGen.
Source code: https://github.com/natolambert/interconnects-tools
Original post: In an era dominated by direct preference optimization and LLM-as-a-judge, why do we still need a model to output only a scalar reward?

Podcast figures:
Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reward-models/img_004.png
Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reward-models/img_009.png

0:00 Why reward models are still key to understanding alignment