Is There a Sample Efficiency?
It actually takes a bit more time to train or to evolve the policy to be able to perform the task. The way I think about these issues is that there are a few dimensions you can work on. One is optimising sample efficiency directly, like maybe reducing an RL algorithm from 200 million time steps to 100 million time steps to achieve some score. Or you can think of sample efficiency in terms of zero-shot transfer. So one can argue that, okay, I spent all this time figuring out the policy using hard attention in this paper, but if you give it a new environment, which is not the same as the original environment, but one that has some augmentation to it, we