
16 - Preparing for Debate AI with Geoffrey Irving

AXRP - the AI X-risk Research Podcast

CHAPTER

How to Fine-Tune an RL Code Base for Language Models

Fine-tuning techniques that work for RL from human preferences also work in this case. If you have an RL code base for language modelling, you can apply it there. We also tried what's typically called upside-down RL. It's just standard supervised learning and reinforcement learning algorithms applied to this red team model. So they generated failures, and then you fine-tune the red team model on those successfully attacking samples. Alright, cool.
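To make that loop concrete, here is a minimal Python sketch of the procedure as described: sample candidate attacks from a red team model, keep the ones that actually elicit a failure from the target model, and fine-tune the red team model on those successes. All function names here (sample_attacks, run_target, is_failure, fine_tune) are hypothetical placeholders, not the actual code base discussed in the episode.

```python
# A minimal sketch of the red-teaming loop described above, under the
# assumption that the pieces (red team LM, target LM, failure classifier,
# fine-tuning step) are supplied as callables. All names are hypothetical.

from typing import Callable, List

def red_team_round(
    sample_attacks: Callable[[int], List[str]],  # draw attack prompts from the red team LM
    run_target: Callable[[str], str],            # target LM's reply to a prompt
    is_failure: Callable[[str, str], bool],      # did (prompt, reply) exhibit a failure?
    fine_tune: Callable[[List[str]], None],      # supervised fine-tuning on prompts
    n_samples: int = 1000,
) -> List[str]:
    candidates = sample_attacks(n_samples)
    # Keep only the prompts that successfully attacked the target model.
    successes = [p for p in candidates if is_failure(p, run_target(p))]
    if successes:
        # Conditioning on success turns the RL problem into ordinary
        # supervised learning over the successful attack prompts.
        fine_tune(successes)
    return successes
```

Filtering on success before fine-tuning is what makes this "upside-down": the reward signal acts as a selection step, so the update itself is ordinary maximum-likelihood training rather than a policy-gradient step.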
