Markov Decision Making with Partial Observability
Markov Decision Processes (MDPs) are central to modeling systems for control and to reinforcement learning: the current state contains all the information needed to decide what to do next. In an MDP, taking an action in a state yields a distribution over possible next states, and a reward function scores how desirable a state or action is. In real-world settings, however, the agent rarely knows the full world state, which leads to Partially Observable Markov Decision Processes (POMDPs). In a POMDP, the agent receives an observation drawn from a distribution conditioned on the underlying state, and that observation carries only partial information. Because no single observation reveals the state, the agent must summarize its history of actions and observations, typically as a belief state, a probability distribution over possible world states. This framework is especially useful for tasks where remembering earlier information is necessary to make informed decisions.
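The belief-state idea above can be sketched with a standard Bayes-filter update: after taking action `a` and seeing observation `o`, the new belief is proportional to the observation likelihood times the predicted state distribution. The toy two-state problem, the `"listen"` action, and the particular probabilities below are hypothetical illustrations, not from the original text.

```python
import numpy as np

# Hypothetical 2-state POMDP for illustration.
# T[a] is the transition matrix (rows: current state s, cols: next state s').
# O[a] is the observation matrix (rows: next state s', cols: observation o).
T = {"listen": np.array([[1.0, 0.0],
                         [0.0, 1.0]])}   # assumed: listening leaves the state unchanged
O = {"listen": np.array([[0.85, 0.15],   # assumed: observation matches the true state 85% of the time
                         [0.15, 0.85]])}

def belief_update(b, a, o):
    """Bayes filter: b'(s') ∝ O(o | s', a) * sum_s T(s' | s, a) * b(s)."""
    predicted = T[a].T @ b             # predict next-state distribution under action a
    unnorm = O[a][:, o] * predicted    # weight by likelihood of the observation o
    return unnorm / unnorm.sum()       # normalize to a proper probability distribution

b = np.array([0.5, 0.5])               # start fully uncertain about the state
b = belief_update(b, "listen", 0)
print(b)  # belief shifts toward state 0 after evidence for it
```

Mentally tracing the update: the prediction step keeps the uniform belief, the observation reweights it to [0.425, 0.075], and normalization yields [0.85, 0.15]. The belief state is a sufficient statistic for the history, so the agent never needs to store the raw sequence of past observations.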