
4 - Risks from Learned Optimization with Evan Hubinger
AXRP - the AI X-risk Research Podcast
00:00
Training on a Huge Corpus of Data
How does a training process modify the model to start optimizing for that thing and encode it in its [unclear]? And fundamentally, in the paper, we sort of analyze two ways. How likely are these different things to occur? How well do they perform on things like simplicity? Are they like, you know, [unclear]?