
Sven Mika

TalkRL: The Reinforcement Learning Podcast


PyTorch and TensorFlow Supported?

We don't have an internal, RLlib-specific deep learning framework. We just basically do everything twice. So the topmost concept in RLlib is the algorithm, which is completely framework-agnostic. It determines when you sample, it determines when things should happen. And then one level below, we have what we call the policy. And that one is framework-specific. The different algorithms, for example PPO and DQN, have their own loss functions, which are part of this policy, written in two ways: in PyTorch and in TensorFlow. The problem with TensorFlow is TensorFlow 1 with the sessions, and TensorFlow 2 with eager and not using sessions and place...
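The two-layer split described here can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not RLlib's actual API: the class names `Algorithm`, `Policy`, `BackendAPolicy`, and `BackendBPolicy`, the `POLICIES` registry, and the toy squared-error loss are all assumptions made for the example, and the two policy classes merely stand in for a PyTorch version and a TensorFlow version of the same loss.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple, Type


class Policy(ABC):
    """Framework-specific layer: each subclass writes the loss for one backend."""

    @abstractmethod
    def loss(self, predictions: List[float], targets: List[float]) -> float:
        ...


class BackendAPolicy(Policy):
    # Hypothetical stand-in for the loss written once in PyTorch.
    def loss(self, predictions: List[float], targets: List[float]) -> float:
        return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


class BackendBPolicy(Policy):
    # Hypothetical stand-in for the same loss written a second time in TensorFlow.
    def loss(self, predictions: List[float], targets: List[float]) -> float:
        total = 0.0
        for p, t in zip(predictions, targets):
            total += (p - t) * (p - t)
        return total / len(targets)


# Hypothetical registry mapping a framework name to its policy class.
POLICIES: Dict[str, Type[Policy]] = {
    "backend_a": BackendAPolicy,
    "backend_b": BackendBPolicy,
}


class Algorithm:
    """Framework-agnostic layer: decides when to sample and when to compute the loss."""

    def __init__(self, framework: str) -> None:
        self.policy = POLICIES[framework]()

    def sample(self) -> Tuple[List[float], List[float]]:
        # A fixed toy batch; a real algorithm would collect this from an environment.
        return [0.5, 1.0, 2.0], [1.0, 1.0, 1.0]

    def train_step(self) -> float:
        predictions, targets = self.sample()
        # The algorithm never touches backend code; it only calls the policy.
        return self.policy.loss(predictions, targets)


if __name__ == "__main__":
    for name in POLICIES:
        print(name, Algorithm(name).train_step())
```

The point of the sketch is that `Algorithm` contains no backend-specific code, so supporting a second framework means writing a second `Policy` subclass rather than a second algorithm.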

Play episode from 21:10
