"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis cover image

Untangling Neural Network Mechanisms: Goodfire's Lee Sharkey on Parameter-based Interpretability

"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Understanding Neural Networks through Sparse Autoencoders

This chapter explores the role of sparse autoencoders (SAEs) in interpreting neural networks and how they relate to parameter-based interpretability. It explains how SAEs decompose a model's dense internal activations using a wide hidden layer, and how the training objective balances a sparsity penalty against faithful reconstruction of those activations in order to surface individual, interpretable features.
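The recipe described above, a wide ReLU encoder, a linear decoder, and a loss that trades reconstruction error against an L1 sparsity penalty, can be sketched on toy data. This is a minimal illustration, not code from the episode; the data, dimensions, and hyperparameters are all invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "activations": 200 samples of a 16-dim signal built from 4 sparse
# ground-truth features (purely synthetic, for illustration only).
d_model, d_hidden, n = 16, 64, 200
true_feats = rng.normal(size=(4, d_model))
coeffs = rng.random(size=(n, 4)) * (rng.random(size=(n, 4)) < 0.3)
X = coeffs @ true_feats

# SAE parameters: the hidden layer is much wider than the input,
# so each feature can claim its own unit once sparsity kicks in.
W_enc = rng.normal(scale=0.1, size=(d_model, d_hidden))
b_enc = np.zeros(d_hidden)
W_dec = rng.normal(scale=0.1, size=(d_hidden, d_model))

l1, lr = 1e-3, 0.01
for step in range(500):
    h = np.maximum(X @ W_enc + b_enc, 0.0)  # ReLU encoder
    X_hat = h @ W_dec                       # linear decoder
    err = X_hat - X
    # Loss = reconstruction MSE + l1 * sum(|h|); backprop by hand.
    mask = (h > 0).astype(float)
    dh = (err @ W_dec.T + l1) * mask        # d|h| = 1 where h > 0
    W_dec -= lr * h.T @ err / n
    W_enc -= lr * X.T @ dh / n
    b_enc -= lr * dh.mean(axis=0)

h = np.maximum(X @ W_enc + b_enc, 0.0)
recon_mse = float(np.mean((h @ W_dec - X) ** 2))
sparsity = float((h > 0).mean())  # fraction of hidden units active
print(f"reconstruction MSE: {recon_mse:.4f}, active fraction: {sparsity:.2f}")
```

The L1 coefficient controls the trade-off: raising it makes fewer hidden units fire per input (sharper, more monosemantic features) at the cost of reconstruction fidelity.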
