The Learning Subsystem and the Steering Subsystem in the AGI

The steering subsystem sends a reward function up to the learning subsystem. The learning subsystem can learn a value function and then take good actions that seem likely to lead to rewards. If you dig a little deeper into the reinforcement learning literature, you can find examples with multidimensional value functions. So I think that, just as a reward can lead to a value function that anticipates the reward, there can also be ground-truth goosebumps that lead to anticipation of future goosebumps, or ground-truth cortisol that leads to anticipation of future cortisol.
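As a rough illustration of the multidimensional value function idea, here is a minimal sketch using tabular TD(0), a standard reinforcement learning update, applied to a vector of ground-truth signals instead of a single reward. Everything specific here is an assumption made for illustration and does not come from the episode: the function name `td0_vector_values`, the three-state chain, the channel ordering (reward, goosebumps, cortisol), and the numeric signal values.

```python
import numpy as np

def td0_vector_values(transitions, n_states, n_channels,
                      gamma=0.9, alpha=0.1, sweeps=500):
    """Tabular TD(0) with one value estimate per state per signal channel.

    Each transition is (state, signal_vector, next_state, done).
    The name and the setup are hypothetical, for illustration only.
    """
    V = np.zeros((n_states, n_channels))
    for _ in range(sweeps):
        for s, signal, s_next, done in transitions:
            # Target: the ground-truth signal now, plus the discounted
            # anticipation of the same signal from the next state.
            target = np.asarray(signal, dtype=float)
            if not done:
                target = target + gamma * V[s_next]
            V[s] += alpha * (target - V[s])
    return V

# Hypothetical chain 0 -> 1 -> 2 -> end.
# Channels (assumed order): [reward, goosebumps, cortisol].
transitions = [
    (0, [0.0, 0.0, 0.0], 1, False),
    (1, [0.0, 1.0, 0.0], 2, False),
    (2, [1.0, 0.0, 0.5], 2, True),   # terminal step; next_state is unused
]

V = td0_vector_values(transitions, n_states=3, n_channels=3)
print(np.round(V, 3))
```

In this toy setup, state 0 emits no signal at all, yet its learned value vector is nonzero in every channel: each entry anticipates the corresponding signal further along the chain, which is the sense in which ground-truth goosebumps or cortisol could lead to anticipated goosebumps or cortisol.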
