#23 - How to actually become an AI alignment researcher, according to Dr Jan Leike

80,000 Hours Podcast

Training AI with Human Feedback

This chapter covers how a machine learning algorithm can be trained using human feedback. The speakers discuss why feedback is a challenge in AI alignment research and the ways the training system can fail. They also describe a three-part training process: humans rank video clips, a reward predictor learns how humans would rank different behaviors, and a reinforcement learning algorithm maximizes the predicted reward.
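
The three-part process described above can be illustrated with a small toy sketch. This is not the system discussed in the episode; it is a minimal illustration under several assumptions: each "clip" is a hypothetical pair of numeric features, the human rater is simulated by a hidden scoring rule, the reward predictor is a linear model trained on pairwise comparisons with a logistic (Bradley-Terry style) loss, and the reinforcement learning step is replaced by simply picking the candidate behavior with the highest predicted reward.

```python
import math
import random


def simulated_human_prefers_first(clip_a, clip_b):
    """Hypothetical stand-in for a human rater comparing two clips.

    Assumption: the rater prefers the clip with the larger hidden score.
    """
    def hidden_score(clip):
        return 2.0 * clip[0] - 1.0 * clip[1]
    return hidden_score(clip_a) > hidden_score(clip_b)


def predicted_reward(weights, clip):
    """Linear reward predictor: r(clip) = weights . clip."""
    return sum(w * x for w, x in zip(weights, clip))


def train_reward_predictor(comparisons, n_features, lr=0.1, epochs=50):
    """Fit weights so that preferred clips get higher predicted reward.

    Models P(preferred beats rejected) = sigmoid(r(preferred) - r(rejected))
    and performs stochastic gradient ascent on the log-likelihood.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in comparisons:
            diff = predicted_reward(weights, preferred) - predicted_reward(weights, rejected)
            prob = 1.0 / (1.0 + math.exp(-diff))
            for i in range(n_features):
                weights[i] += lr * (1.0 - prob) * (preferred[i] - rejected[i])
    return weights


def run_demo(seed=0):
    rng = random.Random(seed)

    # Part 1: a (simulated) human compares pairs of clips.
    clips = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(40)]
    comparisons = []
    for _ in range(200):
        a, b = rng.sample(clips, 2)
        if simulated_human_prefers_first(a, b):
            comparisons.append((a, b))
        else:
            comparisons.append((b, a))

    # Part 2: the reward predictor learns from those comparisons.
    weights = train_reward_predictor(comparisons, n_features=2)

    # Part 3: stand-in for RL, choose the behavior with the highest predicted reward.
    candidates = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(100)]
    best = max(candidates, key=lambda clip: predicted_reward(weights, clip))
    return weights, best


if __name__ == "__main__":
    learned_weights, best_behavior = run_demo()
    print("learned weights:", learned_weights)
    print("behavior chosen by predicted reward:", best_behavior)
```

A real system would use a neural network reward predictor and an actual reinforcement learning algorithm in place of the final `max` over candidates. The toy version only shows how the three parts feed into each other: comparisons train the predictor, and the predictor defines what the agent optimizes.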

