When we train AIs with machine learning, the training regime is set up so it can monitor what the AI does. If the AI does the correct thing, based on some criteria, it gets rewarded; if it does the incorrect thing, it gets punished in some sense. More generally, that's the essence of reinforcement learning: the AI's brain gets modified to be more likely to do the right thing in similar situations in the future, for example the next time I ask it for help with an algorithm. We are using a loss function to create this landscape. Can you define "loss function," by the way, just for people? Or is that going to go too far into the weeds?
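A minimal sketch of the idea being described (my own illustration, not from the episode): the loss function scores the model's output against the desired behaviour, defining a "landscape" over the parameters, and each update nudges the parameters downhill so similar situations are handled better next time. The names `loss` and `gradient_step`, the tiny linear model, and the toy data are all assumptions made for this example.

```python
import numpy as np

def loss(weights, inputs, targets):
    """Mean squared error: the 'landscape' over parameter space.
    Lower loss means behaviour closer to what the training regime rewards."""
    predictions = inputs @ weights          # a tiny linear "model"
    return np.mean((predictions - targets) ** 2)

def gradient_step(weights, inputs, targets, lr=0.1):
    """One corrective update: move the weights downhill on the loss
    landscape so the same inputs produce a better answer next time."""
    predictions = inputs @ weights
    grad = 2 * inputs.T @ (predictions - targets) / len(targets)
    return weights - lr * grad

# Toy data: the "situation" and the behaviour we want rewarded.
inputs  = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
targets = np.array([1.0, 2.0, 3.0])
weights = np.zeros(2)

for step in range(50):
    weights = gradient_step(weights, inputs, targets)

print(loss(weights, inputs, targets))   # approaches 0 as the model learns
```

This is supervised gradient descent rather than full reinforcement learning, but it shows the same mechanism the speaker is gesturing at: a scalar score drives small modifications to the weights, which is what "rewarded" and "punished" cash out to during training.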
