
22 - Shard Theory with Quintin Pope

AXRP - the AI X-risk Research Podcast

CHAPTER

The Effect of Training Processes on Downstream Behaviors

I think there's lots of room for scaling up very hands-off behavioral supervision, and once you have that sort of tooling available to you, it seems like you can get a much better handle on how different training processes influence downstream behaviors. This high-level overview of how all those different training processes change the model's behavior puts you in a much better position for iterating on how to train models into doing things we want them to do on off-training-distribution inputs. Okay, so that's what I'm currently working on.

And who are you working on that with?

Um, three people. So two of them were MATS scholars, Roman Engler and Owen Dundly, MATS scholars from

