
The Nonlinear Library LW - Notes on Dwarkesh Patel's Podcast with Sholto Douglas and Trenton Bricken by Zvi
Apr 2, 2024
25:38
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Notes on Dwarkesh Patel's Podcast with Sholto Douglas and Trenton Bricken, published by Zvi on April 2, 2024 on LessWrong.
Dwarkesh Patel continues to be on fire, and the podcast notes format seems like a success, so we are back once again.
This time the topic is how LLMs are trained, work and will work in the future. Timestamps are for YouTube. Where I inject my own opinions or takes, I do my best to make that explicit and clear.
This was highly technical compared to the average podcast I listen to, or that Dwarkesh does. This podcast definitely threatened to go over my head technically at times, and some details did go over my head outright. I still learned a ton, and expect you will too if you pay attention.
This is an attempt to distill what I found valuable, and what questions I found most interesting. I did my best to make it intuitive to follow even if you are not technical, but in this case one can only go so far. Enjoy.
(1:30) Capabilities only podcast, Trenton has 'solved alignment.' April fools!
(2:15) Huge context windows are underhyped, a huge deal. It occurs to me that the issue is about the trivial inconvenience of providing the context. Right now I mostly do not bother providing context on my queries. If that happened automatically, it would be a whole different ballgame.
(2:50) Could the models be sample efficient if you can fit it all in the context window? Speculation is it might work out of the box.
(3:45) Does this mean models are already in some sense superhuman, with this much context and memory? Well, yeah, of course. Computers have been superhuman at math and chess and so on for a while. Now LLMs have quickly gone from having worse short term working memory than humans to vastly superior short term working memory. Which will make a big difference. The pattern will continue.
(4:30) In-context learning is similar to gradient descent. It gets problematic for adversarial attacks, but of course you can ignore that because, as Trenton reiterates, alignment is solved, and certainly it is solved for such mundane practical concerns. But it does seem like he's saying if you do this then 'you're fine-tuning but in a way where you cannot control what is going on'?
(6:00) Models need to learn how to learn from examples in order to take advantage of long context. So does that mean the task of intelligence requires long context? That this is what causes the intelligence, in some sense, they ask? I don't think you can reverse it that way, but it is possible that this will orient work in directions that are more effective?
(7:00) Dwarkesh asks about how long contexts link to agent reliability. Douglas says this is more about lack of nines of reliability, and GPT-4-level models won't cut it there. And if you need to get multiple things right, the reliability numbers have to multiply together, which does not go well in bulk. If that is indeed the issue then it is not obvious to me the extent to which scaffolding and tricks (e.g. Devin, probably) render this fixable.
(8:45) Performance on complex tasks improves along a log scale. The model gets it right one time in a thousand, then one in a hundred, then one in ten. So there is a clear window where the thing is in practice useless, but you know it soon won't be. And we are in that window on many tasks. This goes double if you have complex multi-step tasks.
If you have a three-step task and are getting each step right one time in a thousand, the full task is one in a billion, but you are not so far from being able to do the task in practice.
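To make that arithmetic concrete, here is a minimal sketch (my own illustration, not from the podcast) of how per-step reliability multiplies across a multi-step task, assuming the steps succeed independently:

```python
# A minimal sketch of the compounding-reliability point: if each of k
# independent steps succeeds with probability p, the whole task succeeds
# with probability p ** k.

def task_success_rate(per_step_success: float, num_steps: int) -> float:
    """Probability of completing every step, assuming independence."""
    return per_step_success ** num_steps

# The three-step example from the text: 1-in-1000 per step -> 1-in-a-billion overall.
print(task_success_rate(1e-3, 3))   # 1e-09

# Improvements compound the same way: at 99.9% per step, the full task
# succeeds about 99.7% of the time.
print(task_success_rate(0.999, 3))  # ~0.997
```

The same multiplication is why a model that is almost there on each subtask can still be hopeless on the full task, and why modest per-step gains can flip that quickly.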
(9:15) The model being presented here is predicting scary capabilities jumps in the future. LLMs can actually (unreliably) do all the subtasks, including identifying what the subtasks are, for a wide variety of complex tasks, but they fall over on subtasks too often and we do not know how to...
