AI-powered
podcast player
Listen to all your favourite podcasts with AI-powered features
Retort Learning
In RL, rewards come from the world. The environment is like the research decides that apple is worth 10 points. And when you eat it, you get 10 reward points. But we don't have quantification of our feeling of awe yet, right? It's not about qualia. It's not subjective. It's about the flow of inflation. That signal is produced by the agent. We need a different paradigm.