
"Two-year update on my personal AI timelines" by Ajeya Cotra

LessWrong (Curated & Popular)

Chapter: Explicitly Breaking Out GPT-N as an Anchor

Short-horizon, inefficiently trained coding models operating pretty close to their training distributions could massively accelerate AI research. I'm now explicitly putting significant weight on an amount of compute that's more like just scaling up language models to brain-ish sizes. This is consistent with doing RL fine-tuning, but needing many OOMs less data for that than for the original training run. And I think that's the most likely way it would manifest.
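
A minimal back-of-the-envelope sketch of the kind of comparison being gestured at here, using the common ~6·N·D FLOP approximation for training a dense transformer. The "brain-ish" parameter count, the pretraining token budget, and the number of OOMs of data saved are all illustrative placeholders, not figures from the episode.

```python
# Back-of-the-envelope comparison: pretraining compute vs. RL fine-tuning
# compute when the fine-tuning stage needs several OOMs less data.
# Uses the common ~6 * N * D FLOP approximation for dense transformer training.
# All concrete numbers are illustrative placeholders, not figures from the episode.

def training_flops(params: float, tokens: float) -> float:
    """Rough FLOP estimate for one training pass: ~6 FLOPs per parameter per token."""
    return 6 * params * tokens

params = 1e14            # placeholder "brain-ish" parameter count
pretrain_tokens = 1e13   # placeholder pretraining dataset size, in tokens

ooms_less_data = 4       # suppose RL fine-tuning needs ~4 OOMs less data
finetune_tokens = pretrain_tokens / 10**ooms_less_data

pretrain_flops = training_flops(params, pretrain_tokens)
finetune_flops = training_flops(params, finetune_tokens)

print(f"Pretraining:  {pretrain_flops:.1e} FLOPs")
print(f"RL fine-tune: {finetune_flops:.1e} FLOPs")
print(f"Fine-tune is {finetune_flops / pretrain_flops:.0e} of the pretraining compute")
```

Under these placeholder numbers, the RL fine-tuning stage adds only about 0.01% on top of the pretraining compute, which is the sense in which "many OOMs less data" makes the fine-tuning roughly free relative to the scaled-up pretraining run.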
