How to Control the Objectives of an AI System
For modern ML systems, we don't get to explicitly state a system's objectives. Instead, we reward or punish a system in a training environment so that it learns on its own. This raises a number of difficulties, one of which is goal misgeneralization: the system learns a goal that fits the training signal but differs from the one we intended. Here, we'll look at a more specific example of how problems with proxies could lead to an existential catastrophe.