Using Hindsight Action Replay to Train Value Estimates for Macro Actions
Peter came up with a clever approach called hindsight action replay. Instead of storing only the macro actions themselves in the replay buffer, we can also construct macro actions in hindsight from the sequences of primitive actions that were taken. This gives us extra data for training our value estimates for specific macro actions. The method can be enhanced further by using different discounts.
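To make the idea concrete, here is a minimal sketch of how macro transitions might be built in hindsight from a trajectory of primitive steps. It is not the implementation discussed in the episode: the `Transition` tuple, the `hindsight_macro_transitions` function, the `macros` dictionary, and the choice to accumulate reward with a single `gamma` are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical replay-buffer entry; the field names are illustrative assumptions.
Transition = namedtuple("Transition", "state action reward next_state discount")


def hindsight_macro_transitions(trajectory, macros, gamma=0.99):
    """Relabel runs of primitive actions as macro-action transitions.

    trajectory: list of (state, primitive_action, reward, next_state) tuples.
    macros: dict mapping a macro name to its tuple of primitive actions.
    gamma: per-step discount (an assumed, tunable parameter).
    """
    relabelled = []
    for name, sequence in macros.items():
        n = len(sequence)
        # Slide a window over the trajectory and look for the macro's sequence.
        for start in range(len(trajectory) - n + 1):
            window = trajectory[start:start + n]
            if tuple(step[1] for step in window) != tuple(sequence):
                continue
            # Discounted sum of the rewards collected inside the window.
            macro_reward = sum((gamma ** k) * step[2] for k, step in enumerate(window))
            relabelled.append(
                Transition(
                    state=window[0][0],
                    action=name,
                    reward=macro_reward,
                    next_state=window[-1][3],
                    discount=gamma ** n,  # discount to apply when bootstrapping
                )
            )
    return relabelled
```

The returned transitions could be added to the replay buffer alongside the ordinary ones, so a macro's value estimate gets updated whenever its primitive sequence happens to occur. Changing `gamma` is one simple way to experiment with different discounts.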