
A recipe for frontier model post-training
Interconnects
00:00
Revolutionizing RLHF: The New Standards in Model Training
This chapter examines recent progress in reinforcement learning from human feedback (RLHF), particularly as applied to the Llama 3.1 model. It emphasizes the role of synthetic data, iterative training, and data filtering in improving RLHF outcomes, and notes that tech companies share similar views on evolving training standards.