AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
I'm Not Trying to Deceive Evolution, Right?
This is the kind of thing, i's tha problem here that we're like, out of distribution because now he's doing other things. And you could call this self modification or self improvement, or whatever it's or building a sub agent rate um. But either way, just the allignment property don't we don't currently have a way to be confident that the alinment properties hold through. There's a lot of cool research about how to align our current systems, that doesn't solve this problem. Ah, and my main line prediction is that we get alignment stuff that looks pretty good and works pretty well until it doesn't.