

AI Trends 2023: Reinforcement Learning - RLHF, Robotic Pre-Training, and Offline RL with Sergey Levine - #612

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

CHAPTER

Is a Language Model a Model?

We previously spoke a little bit about one shortcoming of this preference model and the RLHF approach: that it is not aware of time and sequence and that kind of thing. Is the problem that you just described another angle on the same problem, or are those two separate shortcomings of the way RL has been applied to language models thus far?

I think we've only scratched the surface of the potential for that, because right now the preference stuff is really trying to get these models to act more like people, to sort of mimic people. But I think they can do a lot more than that. They can actually use their deep understanding of the patterns in human behavior to be
