
Rohin Shah
TalkRL: The Reinforcement Learning Podcast
Do You Have a Research Career Plan?
The plan is to train models using human feedback, and then, like, empower the humans providing the feedback as much as we can. Rohin: Knowing everything that the model knows is a pretty high bar, and probably we won't get to it. But there are, like, a bunch of tricks that we can do that get us closer and closer to it. So yeah, please do apply if you are interested in working on alignment.


