AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
Reward Modeling for Prosaic Alignment
I think we still have a lot of work to do in terms of educating people in academia about like the safety concerns and, um, you know, winning hearts and minds is how I put it. And that's especially true for machine learning within the existential safety community and vice versa. So I'm pretty pessimistic because I don't think the technical approaches can be solved by these kinds of systems. We're going to need some ability to coordinate and say, "Let's not pursue this path or let's not deploy these types of systems" That kind of system seems really high and doesn't seem safe at all.