Dwarkesh Podcast cover image

Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind

Dwarkesh Podcast

CHAPTER

Exploring Model Distillation and Adaptive Computation

This chapter investigates the intricate process of model distillation, highlighting its efficiency over conventional training methods. It also addresses the role of adaptive computation and techniques like chain of thought in enhancing models' reasoning capabilities.

00:00
Transcript
Play full episode

Remember Everything You Learn from Podcasts

Save insights instantly, chat with episodes, and build lasting knowledge - all powered by AI.
App store bannerPlay store banner