LessWrong (Curated & Popular)

'Simulators' by Janus


GPT-2 - The Importance of Scalability

The earliest engagement with the hypothetical of "what if self-supervised sequence modeling actually works" that I know of is a terse post from 2019, Implications of GPT-2, by Gurkenglas: "The best models are the largest we were able to fit into GPU memory." Back to the text: "If this passes, we could immediately implement an HCH of AI safety researchers solving the problem, if it's within our reach at all."
