[JUNE 2022] Aran Komatsuzaki on Scaling, GPT-J and Alignment

The Inside View

How to Scale a Multi Task Training Procedure

GPT and T0 are some of the most compute-efficient models out there, and my project is basically trying to combine all of these with scaling. So, yeah, GPT-3 doesn't perform as well as PaLM, but given the amount of compute it consumes, it performs very well on many different tasks. I think it performs very well even without using few-shot samples.
