This Week in Startups cover image

OpenAI releases ChatGPT, a16z sunsets future.com + OK Boomer with Jack Raines | E1626

This Week in Startups

00:00

Reward Models for Reward Learning

We trained an initial model. They wrote using supervised fine tuning human AI, trying trainers provided conversations in which they played both sides. We gave the trainers access to model written suggestions to help them compose their responses and create a reward model for reinforcement learning. The bot actually a positive reinforcement really works. I don't know if they shocked me.

Transcript
Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app