ICLR 2024 — Best Papers & Talks (Benchmarks, Reasoning & Agents) — ft. Graham Neubig, Aman Sanger, Moritz Hardt

Latent Space: The AI Engineer Podcast

Enhancing AI Reasoning with Supervised Feedback Mechanisms

This chapter covers outcome supervision and process supervision for training reward models, focusing on how process supervision, which gives feedback on each reasoning step rather than only on the final answer, improves AI reasoning, particularly in mathematics. It highlights the benefits of this more detailed feedback and discusses the implications of large search spaces and the potential applications of the newly collected data.
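
To make the distinction between the two kinds of supervision concrete, here is a minimal toy sketch in Python. It is not taken from the episode or from any paper's code: the `Solution` class, its fields, and the helper functions are all hypothetical names chosen for illustration, and the labels are made up by hand. It only shows how the training signal differs in granularity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Solution:
    """A hypothetical worked solution to a math problem."""
    steps: List[str]          # intermediate reasoning steps
    step_correct: List[bool]  # illustrative per-step labels (assumed, hand-written)
    final_correct: bool       # whether the final answer is right


def outcome_labels(sol: Solution) -> List[bool]:
    """Outcome supervision: a single label for the whole solution."""
    return [sol.final_correct]


def process_labels(sol: Solution) -> List[bool]:
    """Process supervision: one label per reasoning step."""
    return list(sol.step_correct)


def first_error(sol: Solution) -> Optional[int]:
    """Index of the first step labeled incorrect, or None if all steps pass."""
    for i, ok in enumerate(sol.step_correct):
        if not ok:
            return i
    return None


# Toy example: the second step contains an arithmetic slip.
example = Solution(
    steps=["2 + 3 = 5", "5 * 4 = 25", "25 - 1 = 24"],
    step_correct=[True, False, True],
    final_correct=False,
)

print(outcome_labels(example))  # one signal for the whole solution
print(process_labels(example))  # one signal per step
print(first_error(example))     # where the reasoning first went wrong
```

With outcome labels alone, a reward model learns only that this solution failed. With per-step labels it can also learn where the failure happened, which is the more detailed feedback the chapter summary refers to.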
