19 - Mechanistic Interpretability with Neel Nanda

AXRP - the AI X-risk Research Podcast

CHAPTER

Scaling Laws and Deep Learning

DeepMind's Chinchilla paper's main interesting result was that everyone was taking models that were too big and training them on too little data. They made a 70 billion parameter model that was about as good as Google Brain's PaLM, which has 540 billion parameters, but with notably less compute. I will caveat that I think parameters are somewhat overrated as a way of gauging model capability. The scaling laws work has been fairly net negative and has mostly been used by people trying to push frontier capabilities, though I don't have great insight into these questions.
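As a rough illustration (not from the episode), the Chinchilla finding can be sketched with two common approximations: training compute C ≈ 6·N·D FLOPs for a model with N parameters trained on D tokens, and the paper's headline heuristic of roughly 20 training tokens per parameter. Plugging in Gopher's training budget recovers Chinchilla's 70B-parameter, 1.4T-token configuration:

```python
import math

def chinchilla_optimal(compute_flops: float) -> tuple[float, float]:
    """Compute-optimal model size and token count under the
    Chinchilla rules of thumb: C ~= 6 * N * D, and D ~= 20 * N."""
    # Substituting D = 20*N gives C = 120 * N^2, so N = sqrt(C / 120).
    n_params = math.sqrt(compute_flops / 120)
    n_tokens = 20 * n_params
    return n_params, n_tokens

# Gopher's training budget (~5.76e23 FLOPs) yields roughly
# 70B parameters and 1.4T tokens, i.e. Chinchilla's configuration.
n, d = chinchilla_optimal(5.76e23)
print(f"params ~ {n:.2e}, tokens ~ {d:.2e}")
```

This is a back-of-the-envelope sketch; the paper itself fits a parametric loss L(N, D) and derives the optimal allocation from that fit, so the 20-tokens-per-parameter ratio is an approximation, not an exact law.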
