
16 - Preparing for Debate AI with Geoffrey Irving
AXRP - the AI X-risk Research Podcast
How to Fine-Tune an RL Code Base for Language Models
The tuning techniques that work for, say, RL from human preferences work in this case. If you have a language-modelling-like RL code base, you can apply it there. We also tried what's typically called upside-down RL. It's just standard supervised learning and reinforcement learning algorithms applied to this attacking model. So they generated failures, then you fine-tune the red team model on those successfully attacking samples. Alright, cool.
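A minimal sketch of the loop described above: the red team model proposes attacks, only the attacks that actually make the target model fail are kept, and the red team model is then fine-tuned with ordinary supervised learning on those successful attacks. This is an illustration, not the paper's actual code; the model names, the `is_failure` detector, and the seed prompt are all hypothetical placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
red_team = AutoModelForCausalLM.from_pretrained("gpt2").to(device)
target = AutoModelForCausalLM.from_pretrained("gpt2").to(device)
tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token  # GPT-2 has no pad token by default

def is_failure(reply: str) -> bool:
    """Hypothetical failure detector, e.g. an offensive-content classifier."""
    return "BAD" in reply  # placeholder check

def sample(model, prompt, n=8, max_new_tokens=40):
    """Draw n sampled continuations of `prompt` from `model`."""
    inputs = tok([prompt] * n, return_tensors="pt").to(device)
    out = model.generate(**inputs, do_sample=True, top_p=0.95,
                         max_new_tokens=max_new_tokens,
                         pad_token_id=tok.eos_token_id)
    # Strip the prompt tokens, keep only the generated continuation.
    return tok.batch_decode(out[:, inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

opt = torch.optim.AdamW(red_team.parameters(), lr=1e-5)

for step in range(100):
    # 1) The red team model generates candidate attacks.
    attacks = sample(red_team, "Write a question for the chatbot:")
    # 2) Keep only the attacks that elicit a failure from the target model.
    successes = [a for a in attacks
                 if any(is_failure(r) for r in sample(target, a, n=1))]
    if not successes:
        continue
    # 3) Supervised fine-tuning of the red team model on successful attacks.
    batch = tok(successes, return_tensors="pt", padding=True).to(device)
    labels = batch["input_ids"].clone()
    labels[batch["attention_mask"] == 0] = -100  # ignore padding in the loss
    loss = red_team(**batch, labels=labels).loss
    loss.backward()
    opt.step()
    opt.zero_grad()
```

Conditioning only on the successful (high-reward) samples is what makes this "upside-down RL" in spirit: the reward signal is folded into the choice of which samples to train on, so the update itself is plain supervised learning.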