TalkRL: The Reinforcement Learning Podcast

Natasha Jaques 2

Mar 14, 2023

Chapters

The Challenges and Limitations of RLHF

Is This Data Set Reusable?

Optimize for Reward, but Maximize the Reward Function

The Reward Model Isn't Perfect, Right?

Token Level Probabilities

Recursive Reward Models - Are We Going to Need That Soon?

Are You Up for Talking About AGI?

ChatGPT - Robotics Is Super Hard

Do You Really Know What You're Doing?

Is There a Language Model That Can Understand Language?

Is It a Good Idea to Go Back to Academia?

Do You Have a Clear Idea of AI?

Social Learning Versus Imitative Learning

Adaptive Online Generalization for Self-Driving Cars

Is There a Distractor in Deep Learning?