Exploring Model Safety and Ablations

This chapter covers the speakers' doubts about whether enumerative safety is effective and the challenges that superhuman models pose. It also explores ablations and proposes two ideas for improving performance: retraining the model, and augmenting smaller models with explanations.

Play episode from 20:12
