3min snip

Can we build a generalist agent? Dr. Minqi Jiang and Dr. Marc Rigter

Machine Learning Street Talk (MLST)

NOTE

Markov Decision Processes under Partial Observability

Markov Decision Processes (MDPs) are the standard model for control and reinforcement learning: the current state contains all the information needed to decide how to act. Taking an action in a state induces a distribution over next states, and in reinforcement learning a reward function scores how desirable a state or action is. In real-world settings, however, the agent rarely knows the full world state, which leads to Partially Observable Markov Decision Processes (POMDPs): instead of the state itself, the agent receives an observation drawn from a distribution conditioned on the state, and that observation carries only partial information. To act well, the agent must therefore keep track of its history of past observations, or equivalently a belief over states that summarizes that history. This framework is especially useful for tasks that require remembering earlier information to make informed decisions.
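The belief-tracking idea above can be sketched concretely. The following is a minimal, hypothetical two-state POMDP (not from the episode): the agent cannot see the state, so it maintains a probability distribution over states and updates it with Bayes' rule after each action and observation.

```python
# Hypothetical two-state, two-observation POMDP for illustration only.
# The agent tracks a belief b(s) instead of the true state.

STATES = [0, 1]

# T[a][s][s2]: probability of moving to state s2 after action a in state s.
T = {
    "stay": [[0.9, 0.1],
             [0.1, 0.9]],
}

# O[s2][o]: probability of observing o when the true next state is s2.
O = [[0.8, 0.2],
     [0.3, 0.7]]

def belief_update(belief, action, obs):
    """Bayes filter: b'(s2) ∝ O(obs | s2) * sum_s T(s2 | s, action) * b(s)."""
    predicted = [
        sum(T[action][s][s2] * belief[s] for s in STATES)
        for s2 in STATES
    ]
    unnormalized = [O[s2][obs] * predicted[s2] for s2 in STATES]
    total = sum(unnormalized)
    return [p / total for p in unnormalized]

# Start fully uncertain, then see observation 0 twice after "stay" actions:
b = [0.5, 0.5]
b = belief_update(b, "stay", 0)
b = belief_update(b, "stay", 0)
print(b)  # belief has shifted toward state 0
```

The belief is a sufficient statistic for the observation history, which is why POMDP agents (or the recurrent policies mentioned in this kind of work) can condition on it rather than on the raw sequence of past observations.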
