
Sven Mika
TalkRL: The Reinforcement Learning Podcast
RLlib: PyTorch and TensorFlow Supported?
We don't have an internal, RLlib-specific [unclear] framework. We just basically do everything twice. So the top-level concept in RLlib is the algorithm, which is completely framework-agnostic. It determines when you sample, it determines when things should happen. And then one level below, we have what we call the policy, and that one is framework-specific. The different algorithms, for example PPO and DQN, have their own loss functions, which are part of this policy, written in two ways. In PyTorch, [unclear] the problem of TensorFlow 1 with the sessions and TensorFlow 2 with eager, and then not using sessions and place...
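The split described here can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not RLlib's actual API: the class names `Algorithm`, `Policy`, `TorchStylePolicy`, and `TFStylePolicy`, the `train_step` method, and the squared-error loss are all hypothetical, and the two policy classes use plain arithmetic as stand-ins for real PyTorch and TensorFlow operations.

```python
from abc import ABC, abstractmethod


class Policy(ABC):
    """Framework-specific layer (hypothetical): owns the loss function."""

    @abstractmethod
    def loss(self, batch):
        """Return a scalar loss for a batch of (prediction, target) pairs."""


class TorchStylePolicy(Policy):
    # Stand-in for a loss written with PyTorch ops (assumption: no torch import here).
    def loss(self, batch):
        return sum((pred - target) ** 2 for pred, target in batch) / len(batch)


class TFStylePolicy(Policy):
    # The same loss written a second time, standing in for the TensorFlow version.
    def loss(self, batch):
        total = 0.0
        for pred, target in batch:
            diff = pred - target
            total += diff * diff
        return total / len(batch)


class Algorithm:
    """Framework-agnostic layer (hypothetical): decides when to sample and update."""

    def __init__(self, policy, sample_fn):
        self.policy = policy        # any Policy, regardless of framework
        self.sample_fn = sample_fn  # callable returning a batch

    def train_step(self):
        batch = self.sample_fn()        # "when you sample"
        return self.policy.loss(batch)  # loss comes from the framework-specific policy


if __name__ == "__main__":
    data = [(1.0, 0.0), (3.0, 1.0)]
    for policy in (TorchStylePolicy(), TFStylePolicy()):
        algo = Algorithm(policy, lambda: data)
        print(type(policy).__name__, algo.train_step())
```

The point of the sketch is that `Algorithm` never touches framework code: swapping the policy class is the only change needed, at the cost of writing each loss twice.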
Play episode from 21:10


