The Learning Subsystem and the Steering Subsystem in the AGI
The steering subsystem sends a reward function up to the learning subsystem. The learning subsystem can learn a value function and then take good actions that seem likely to lead to rewards. If you dig a little deeper into the reinforcement learning literature, you can find examples with multidimensional value functions. So I think that, just as a reward can lead to a value function that anticipates the reward, there can also be ground-truth goosebumps that lead to anticipation of future goosebumps, or ground-truth cortisol that leads to anticipation of future cortisol.
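
To make the idea of a multidimensional value function concrete, here is a minimal sketch using tabular TD(0) learning, where each state gets a vector of predictions, one per ground-truth signal. It is an illustration under stated assumptions, not the speaker's model: the three-state chain, the signal magnitudes, the channel names, and the function names `td0_vector_values` and `exact_values` are all hypothetical.

```python
import numpy as np

# Hypothetical channel names; each column of a signal array is one ground-truth signal.
SIGNALS = ("reward", "goosebumps", "cortisol")


def td0_vector_values(signals, gamma=0.9, alpha=0.1, sweeps=2000):
    """Learn one value estimate per (state, channel) with TD(0).

    signals[s] is the ground-truth signal vector received on leaving state s
    in a simple chain s -> s + 1 that ends after the last state.
    """
    n_states, n_channels = signals.shape
    # Extra final row is the terminal state, whose value stays at zero.
    values = np.zeros((n_states + 1, n_channels))
    for _ in range(sweeps):
        for s in range(n_states):
            # The same TD update runs independently on every channel.
            td_error = signals[s] + gamma * values[s + 1] - values[s]
            values[s] += alpha * td_error
    return values[:n_states]


def exact_values(signals, gamma=0.9):
    """Discounted sum of each future signal, computed backward for comparison."""
    n_states, n_channels = signals.shape
    values = np.zeros((n_states + 1, n_channels))
    for s in reversed(range(n_states)):
        values[s] = signals[s] + gamma * values[s + 1]
    return values[:n_states]


# Assumed toy chain: goosebumps occur on leaving state 1,
# reward and cortisol on leaving state 2.
chain = np.array([
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 2.0],
])

learned = td0_vector_values(chain)
for name, column in zip(SIGNALS, learned.T):
    print(name, np.round(column, 3))
```

Each column of `learned` is an anticipation of its own signal: in state 0, before anything has happened, the estimates for goosebumps and cortisol are already nonzero because those signals lie ahead. How a brain or an AGI would combine such channels into action selection is left open here.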