
Thomas Dietterich: From the Foundations

The Gradient: Perspectives on AI

NOTE

**The Unstable Reward Function of Cell Tower Operators for Reinforcement Learning**

In a collaboration with George Trimponias from Huawei, we tackled the problem of ignoring certain state variables when controlling cell towers. While formulating it as a reinforcement learning problem, we realized that factors such as customer behavior during rush hour or in low-demand periods were beyond the tower's control. The controller could only make a difference to customers at the margins. This led us to understand that the state of the system included external variables that the controller could not influence, even though the reward function depended on them.
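To make the point about marginal influence concrete, here is a minimal toy sketch. It is not the method from the collaboration. It assumes the reward is simply the sum of an external demand term and a small term the controller can affect, and every name and number in it (`simulate_rewards`, `action_gain`, the demand mean of 100 and spread of 30) is hypothetical.

```python
import random
import statistics


def simulate_rewards(action_gain, n_steps=2000, seed=0):
    """Toy model (hypothetical): reward = external demand + controllable margin."""
    rng = random.Random(seed)
    full, controllable = [], []
    for _ in range(n_steps):
        # External part: customer demand, which the controller cannot influence.
        demand = rng.gauss(100.0, 30.0)
        # Controllable part: the marginal effect of the controller's action.
        margin = action_gain + rng.gauss(0.0, 1.0)
        full.append(demand + margin)
        controllable.append(margin)
    return full, controllable


# Two hypothetical control policies whose true difference in gain is 1.0.
full_a, endo_a = simulate_rewards(action_gain=1.0, seed=1)
full_b, endo_b = simulate_rewards(action_gain=2.0, seed=2)

# Estimated advantage of policy B over A, with and without the external term.
full_gap = statistics.mean(full_b) - statistics.mean(full_a)
endo_gap = statistics.mean(endo_b) - statistics.mean(endo_a)

# Spread of the reward signal, with and without the external term.
full_sd = statistics.stdev(full_a)
endo_sd = statistics.stdev(endo_a)

print(f"gap estimated from full reward:         {full_gap:.2f} (sd {full_sd:.1f})")
print(f"gap estimated from controllable reward: {endo_gap:.2f} (sd {endo_sd:.1f})")
```

In this toy setup the external demand dominates the spread of the full reward, so the difference between the two policies is estimated far less reliably from the full reward than from the controllable part alone. That is the sense in which the reward looks unstable from the controller's point of view.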
