Nora Belrose - AI Development, Safety, and Meaning

Machine Learning Street Talk (MLST)

CHAPTER

Navigating Neural Network Modifications

This chapter explores surgical edits to neural network representations, focusing on how to modify a target concept, such as part of speech, while preserving the model's performance. It discusses the implications of concept erasure in language models: removing a concept can increase error, but models adapt by seeking alternative cues. The chapter also highlights experimental findings on the CIFAR-10 dataset and cautions against over-reliance on techniques like Q-LEACE because of potential information leakage.
