Super Data Science: ML & AI Podcast with Jon Krohn

802: In Case You Missed It in June 2024

Implications of Model Fragility and Safety in AI Systems

This chapter dives into the fragility of RLHF models and the risks of stripping away safety layers during fine-tuning. It highlights how adapting a model to specific tasks can undermine safety behaviors established during pre-training, stressing the need to treat safety as integral to the AI system as a whole.
