

The Nonlinear Library
The Nonlinear Fund
The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org
Episodes

May 3, 2024 • 49min
LW - AI #62: Too Soon to Tell by Zvi
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI #62: Too Soon to Tell, published by Zvi on May 3, 2024 on LessWrong.
What is the mysterious impressive new 'gpt2-chatbot' from the Arena? Is it GPT-4.5? A refinement of GPT-4? A variation on GPT-2 somehow? A new architecture? Q-star? Someone else's model? Could be anything. It is so weird that this is how someone chose to present that model.
There was also a lot of additional talk this week about California's proposed SB 1047.
I wrote an additional post extensively breaking that bill down, explaining how it would work in practice, addressing misconceptions about it and suggesting fixes for its biggest problems along with other improvements. For those interested, I recommend reading at least the sections 'What Do I Think The Law Would Actually Do?' and 'What are the Biggest Misconceptions?'
As usual, lots of other things happened as well.
Table of Contents
1. Introduction.
2. Table of Contents.
3. Language Models Offer Mundane Utility. Do your paperwork for you. Sweet.
4. Language Models Don't Offer Mundane Utility. Because it is not yet good at it.
5. GPT-2 Soon to Tell. What is this mysterious new model?
6. Fun With Image Generation. Certified made by humans.
7. Deepfaketown and Botpocalypse Soon. A located picture is a real picture.
8. They Took Our Jobs. Because we wouldn't let other humans take them first?
9. Get Involved. It's protest time. Against AI that is.
10. In Other AI News. Incremental upgrades, benchmark concerns.
11. Quiet Speculations. Misconceptions cause warnings of AI winter.
12. The Quest for Sane Regulation. Big tech lobbies to avoid regulations, who knew?
13. The Week in Audio. Lots of Sam Altman, plus some others.
14. Rhetorical Innovation. The few people who weren't focused on SB 1047.
15. Open Weights Are Unsafe And Nothing Can Fix This. Tech for this got cheaper.
16. Aligning a Smarter Than Human Intelligence is Difficult. Dot by dot thinking.
17. The Lighter Side. There must be some mistake.
Language Models Offer Mundane Utility
Write automatic police reports based on body camera footage. It seems it only uses the audio? Not using the video seems to be giving up a lot of information. Even so, law enforcement seems impressed, with one noting an 82% reduction in time spent writing reports, even with proofreading requirements.
Axon says it did a double-blind study to compare its AI reports with ones from regular officers.
And it says that Draft One results were "equal to or better than" regular police reports.
As with self-driving cars, that is not obviously sufficient.
Eliminate 2.2 million unnecessary words in the Ohio administrative code, out of a total of 17.4 million. The AI identified candidate language, which humans reviewed. Sounds great, but let's make sure we keep that human in the loop.
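As a minimal sketch of that human-in-the-loop pattern (the llm_flag_candidates helper below is hypothetical; the post does not describe Ohio's actual tooling), the structure might look like this:

```python
# Hypothetical sketch: the model only nominates deletions; nothing is removed
# without a human reviewer's explicit approval.

def llm_flag_candidates(section_text: str) -> list[str]:
    """Placeholder for an LLM call that returns passages it deems redundant."""
    return []  # replace with a real model call

def review_code_sections(sections: dict[str, str], human_approves) -> dict[str, list[str]]:
    approved = {}
    for section_id, text in sections.items():
        candidates = llm_flag_candidates(text)
        # Human in the loop: keep only the deletions a reviewer signs off on.
        approved[section_id] = [c for c in candidates if human_approves(section_id, c)]
    return approved
```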
Diagnose your medical condition? Link has a one-minute video of a doctor asking questions and correctly diagnosing a patient.
Ate-a-Pi: This is why AI will replace doctor.
Sherjil Ozair: diagnosis any%.
Akhil Bagaria: This is the entire premise of the TV show House.
The first AI attempt listed only does 'the easy part' of putting all the final information together. Kiaran Ritchie then shows that yes, ChatGPT can figure out what questions to ask, working through the problem with eight requests over two steps before arriving at a solution.
There are still steps where the AI is getting extra information, but they do not seem like the 'hard steps' to me.
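A minimal sketch of that question-asking loop, using the OpenAI Python SDK; the model name, prompts, and canned patient answers here are illustrative assumptions, not the setup Ritchie used:

```python
from openai import OpenAI

client = OpenAI()  # requires OPENAI_API_KEY in the environment

messages = [{"role": "system",
             "content": "You are a diagnostician. Ask one question at a time; "
                        "when confident, state your diagnosis."}]

# Canned answers standing in for a patient or clinician replying to the model.
patient_answers = iter([
    "The patient has had a fever and joint pain for two weeks.",
    "Yes, there was a camping trip and a tick bite last month.",
])

for _ in range(3):
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    question_or_diagnosis = reply.choices[0].message.content
    print("MODEL:", question_or_diagnosis)
    messages.append({"role": "assistant", "content": question_or_diagnosis})
    answer = next(patient_answers, "No further information available.")
    messages.append({"role": "user", "content": answer})
    print("PATIENT:", answer)
```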
Is Sam Altman subtweeting me?
Sam Altman: Learning how to say something in 30 seconds that takes most people 5 minutes is a big unlock.
(and imo a surprisingly learnable skill.
If you struggle with this, consider asking a friend who is good at it to listen to you say something and then rephrase it back to you as concisely as they can a few dozen times.
I have seen this work really well!)
Interesting DM: "For what it's worth this...

May 3, 2024 • 2min
AF - Mechanistic Interpretability Workshop Happening at ICML 2024! by Neel Nanda
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Mechanistic Interpretability Workshop Happening at ICML 2024!, published by Neel Nanda on May 3, 2024 on The AI Alignment Forum.
Announcing the first academic Mechanistic Interpretability workshop, held at ICML 2024!
We'd love to get papers submitted if any of you have relevant projects! Deadline May 29, max 4 or max 8 pages. We welcome anything that brings us closer to a principled understanding of model internals, even if it's not "traditional" mech interp. Check out our website for example topics! There's $1750 in best paper prizes. We also welcome less standard submissions, like open source software, models or datasets, negative results, distillations, or position pieces.
And if anyone is attending ICML, you'd be very welcome at the workshop! We have a great speaker line-up: Chris Olah, Jacob Steinhardt, David Bau and Asma Ghandeharioun. And a panel discussion, hands-on tutorial, and social. I'm excited to meet more people into mech interp! And if you know anyone who might be interested in attending/submitting, please pass this on.
Twitter thread,
Website
Thanks to my great co-organisers: Fazl Barez, Lawrence Chan, Kayo Yin, Mor Geva, Atticus Geiger and Max Tegmark
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

May 2, 2024 • 2min
EA - On John Woolman (Thing of Things) by Aaron Gertler
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: On John Woolman (Thing of Things), published by Aaron Gertler on May 2, 2024 on The Effective Altruism Forum.
My favorite EA blogger tells the story of an early abolitionist.
The subtitle, "somewhat in favor of guilt", is better than any summary I'd write.
John Woolman would probably be mad at me for writing a post about his life. He never thought his life mattered.
Partially, he hated the process of traveling: the harshness of life on the road; being away from his family; the risk of bringing home smallpox, which terrified him.
But mostly it was the task being asked of Woolman that filled him with grief. Woolman was naturally "gentle, self-deprecating, and humble in his address", but he felt called to harshly condemn slaveowning Quakers. All he wanted was to be able to have friendly conversations with people who were nice to him. But instead, he felt, God had called him to be an Old Testament prophet, thundering about God's judgment and the need for repentance.
I don't think you get John Woolman without the scrupulosity. If someone is the kind of person who sacrifices money, time with his family, approval from his community, his health - in order to do a thankless, painful task that goes against all of his instincts for how to interact with other people, with no sign of success
a task that, if it advanced abolition only in Pennsylvania by even a single year, prevented nearly 7,000 years of enslavement, and by any reasonable estimate prevented thousands or tens of thousands more
Well, someone like that is going to be extra about the non-celebration of Christmas.
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

May 2, 2024 • 1min
EA - Ask me questions here about my 80,000 hours podcast on preventing neonatal deaths with Kangaroo Mother Care by deanspears
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Ask me questions here about my 80,000 hours podcast on preventing neonatal deaths with Kangaroo Mother Care, published by deanspears on May 2, 2024 on The Effective Altruism Forum.
I was interviewed in yesterday's 80,000 hours podcast: Dean Spears on why babies are born small in Uttar Pradesh, and how to save their lives. As I say in the podcast, there's good evidence that this is a cost-effective way to save lives. Many peer-reviewed articles show that Kangaroo Mother Care is effective. The 80k link has many further links to the articles and data behind the podcast. You can see GiveWell's write up of their support for our project at this link.
This partnership with a large government medical college is able to reach many babies. And with more funding, we could achieve more. Anyone can support this project by donating, at riceinstitute.org, to a 501(c)(3) public charity.
If you have any questions, please feel free to ask below!
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

May 2, 2024 • 59sec
LW - Which skincare products are evidence-based? by Vanessa Kosoy
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Which skincare products are evidence-based?, published by Vanessa Kosoy on May 2, 2024 on LessWrong.
The beauty industry offers a large variety of skincare products (marketed mostly at women), differing both in alleged function and (substantially) in price. However, it's pretty hard to test for yourself how much any of these products help. The feedback loop for things like "getting fewer wrinkles" is very long.
So, which of these products are actually useful and which are mostly a waste of money? Are more expensive products actually better, or do they just have better branding? How can I find out?
I would guess that sunscreen is definitely helpful, and using some moisturizers for face and body is probably helpful. But, what about night cream? Eye cream? So-called "anti-aging"? Exfoliants?
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

May 2, 2024 • 1h 9min
LW - Q&A on Proposed SB 1047 by Zvi
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Q&A on Proposed SB 1047, published by Zvi on May 2, 2024 on LessWrong.
Previously: On the Proposed California SB 1047.
Text of the bill is here. It focuses on safety requirements for highly capable AI models.
This is written as an FAQ, tackling all questions or points I saw raised.
Safe & Secure AI Innovation Act also has a description page.
Why Are We Here Again?
There have been many highly vocal and forceful objections to SB 1047 this week, in reaction to a (disputed and seemingly incorrect) claim that the bill has been 'fast tracked.'
The bill continues to have a substantial chance of becoming law according to Manifold, where the market has not moved on recent events. The bill has been referred to two policy committees, one of which put out this 38-page analysis.
The purpose of this post is to gather and analyze all objections that came to my attention in any way, including all responses to my request for them on Twitter, and to suggest concrete changes that address some real concerns that were identified.
1. Some are helpful critiques pointing to potential problems, or good questions where we should ensure that my current understanding is correct. In several cases, I suggest concrete changes to the bill as a result. Two are important to fix weaknesses, one is a clear improvement, the others are free actions for clarity.
2. Some are based on what I strongly believe is a failure to understand how the law works, both in theory and in practice, or a failure to carefully read the bill, or both.
3. Some are pointing out a fundamental conflict. They want people to have the ability to freely train and release the weights of highly capable future models. Then they notice that it will become impossible to do this while adhering to ordinary safety requirements. They seem to therefore propose to not have safety requirements.
4. Some are alarmist rhetoric that has little tether to what is in the bill, or how any of this works. I am deeply disappointed in some of those using or sharing such rhetoric.
Throughout such objections, there is little or no acknowledgement of the risks that the bill attempts to mitigate, suggestions of alternative ways to do that, or reasons to believe that such risks are insubstantial even absent required mitigation. To be fair to such objectors, many of them have previously stated that they believe that future more capable AI poses little catastrophic risk.
I get making mistakes; indeed, it would be surprising if this post contained none of its own. Understanding even a relatively short bill like SB 1047 requires close reading. If you thoughtlessly forward anything that sounds bad (or good) about such a bill, you are going to make mistakes, some of which are going to look dumb.
What is the Story So Far?
If you have not previously done so, I recommend reading my previous coverage of the bill when it was proposed, although note the text has been slightly updated since then.
In the first half of that post, I did an RTFB (Read the Bill). I read it again for this post.
The core bill mechanism is that if you want to train a 'covered model,' meaning one trained on 10^26 FLOPs or achieving performance similar to or greater than what that much compute would buy in 2024, then various safety requirements attach. If you fail in your duties you can be fined; if you purposefully lie about it, that is perjury.
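As a rough illustration (not legal advice, and not the bill's actual text), the covered-model trigger described above can be read as a two-prong test:

```python
# A minimal sketch of the trigger as described here: training compute of 1e26
# FLOP, or performance comparable to what that much compute would buy in 2024.
# The equivalence prong is a stand-in; the bill leaves that determination to
# regulators, not to a simple predicate.

COMPUTE_THRESHOLD_FLOP = 1e26

def is_covered_model(training_flop: float, comparable_to_2024_frontier: bool) -> bool:
    """Rough reading of the trigger: either prong suffices."""
    return training_flop >= COMPUTE_THRESHOLD_FLOP or comparable_to_2024_frontier

# Example: a 3e25-FLOP run that nonetheless matches 2024 frontier capability
# would still attach the safety requirements under this reading.
print(is_covered_model(3e25, comparable_to_2024_frontier=True))  # True
```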
I concluded this was a good faith effort to put forth a helpful bill. As the bill deals with complex issues, it contains both potential loopholes on the safety side, and potential issues of inadvertent overreach, unexpected consequences or misinterpretation on the restriction side.
In the second half, I responded to Dean Ball's criticisms of the bill, which he called 'California's Effort to Strangle AI.'
1. In the section What Is a Covered Model,...

May 2, 2024 • 7min
LW - Please stop publishing ideas/insights/research about AI by Tamsin Leake
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Please stop publishing ideas/insights/research about AI, published by Tamsin Leake on May 2, 2024 on LessWrong.
Basically all ideas/insights/research about AI are potentially exfohazardous. At least, it's pretty hard to know when some ideas/insights/research will actually make things better; especially in a world where building an aligned superintelligence (let's call this work "alignment") is considerably harder than building any superintelligence (let's call this work "capabilities"), and there are a lot more people trying to do the latter than the former, and they have a lot more material resources.
Ideas about AI, let alone insights about AI, let alone research results about AI, should be kept to private communication between trusted alignment researchers. On LessWrong, we should focus on teaching people the rationality skills that could help them figure out the insights needed to build any superintelligence, but that are more likely to first give them the insight that doing so is a bad idea.
For example, OpenAI has demonstrated that they're just gonna cheerfully head towards doom. If you give OpenAI, say, interpretability insights, they'll just use them to work towards doom faster; what you need is to either give OpenAI enough rationality to slow down (even just a bit), or at least not give them anything.
To be clear, I don't think people working at OpenAI know that they're working towards doom; a much more likely hypothesis is that they've memed themselves into not thinking very hard about the consequences of their work, and to erroneously feel vaguely optimistic about those due to cognitive biases such as wishful thinking.
It's very rare that any research purely helps alignment, because any alignment design is a fragile target that is just a few changes away from unaligned. There is no alignment plan which fails harmlessly if you fuck up implementing it, and people tend to fuck things up unless they try really hard not to (and often even if they do), and people don't tend to try really hard not to.
This applies doubly so to work that aims to make AI understandable or helpful, rather than aligned - a helpful AI will help anyone, and the world has more people trying to build any superintelligence (let's call those "capabilities researchers") than people trying to build aligned superintelligence (let's call those "alignment researchers").
Worse yet: if focusing on alignment is correlated with higher rationality and thus with better ability for one to figure out what they need to solve their problems, then alignment researchers are more likely to already have the ideas/insights/research they need than capabilities researchers, and thus publishing ideas/insights/research about AI is more likely to differentially help capabilities researchers.
Note that this is another relative statement; I'm not saying "alignment researchers have everything they need", I'm saying "in general you should expect them to need less outside ideas/insights/research on AI than capabilities researchers".
Alignment is a differential problem. We don't need alignment researchers to succeed as fast as possible; what we really need is for alignment researchers to succeed before capabilities researchers. Don't ask yourself "does this help alignment?", ask yourself "does this help alignment more than capabilities?".
"But superintelligence is so far away!" - even if this was true (it isn't) then it wouldn't particularly matter. There is nothing that makes differentially helping capabilities "fine if superintelligence is sufficiently far away". Differentially helping capabilities is just generally bad.
"But I'm only bringing up something that's already out there!" - something "already being out there" isn't really a binary thing. Bringing attention to a concept that's "already out there" is an ex...

May 2, 2024 • 3min
LW - An explanation of evil in an organized world by KatjaGrace
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: An explanation of evil in an organized world, published by KatjaGrace on May 2, 2024 on LessWrong.
A classic problem with Christianity is the so-called 'problem of evil' - that friction between the hypothesis that the world's creator is arbitrarily good and powerful, and a large fraction of actual observations of the world.
Coming up with solutions to the problem of evil is a compelling endeavor if you are really rooting for a particular bottom line re Christianity, or I guess if you enjoy making up faux-valid arguments for wrong conclusions. At any rate, I think about this more than you might guess.
And I think I've solved it!
Or at least, I thought of a new solution which seems better than the others I've heard. (Though I mostly haven't heard them since high school.)
The world (much like anything) has different levels of organization. People are made of cells; cells are made of molecules; molecules are made of atoms; atoms are made of subatomic particles, for instance.
You can't actually make a person (of the usual kind) without including atoms, and you can't make a whole bunch of atoms in a particular structure without having made a person. These are logical facts, just like you can't draw a triangle without drawing corners, and you can't draw three corners connected by three lines without drawing a triangle. In particular, even God can't.
(This is already established I think - for instance, I think it is agreed that God cannot make a rock so big that God cannot lift it, and that this is not a threat to God's omnipotence.)
So God can't make the atoms be arranged one way and the humans be arranged another contradictory way. If God has opinions about what is good at different levels of organization, and they don't coincide, then he has to make trade-offs. If he cares about some level aside from the human level, then at the human level, things are going to have to be a bit suboptimal sometimes. Or perhaps entirely unrelated to what would be optimal, all the time.
We usually assume God only cares about the human level. But if we take for granted that he made the world maximally good, then we might infer that he also cares about at least one other level.
And I think if we look at the world with this in mind, it's pretty clear where that level is. If there's one thing God really makes sure happens, it's 'the laws of physics'. Though presumably laws are just what you see when God cares. To be 'fundamental' is to matter so much that the universe runs on the clockwork of your needs being met. There isn't a law of nothing bad ever happening to anyone's child; there's a law of energy being conserved in particle interactions.
God cares about particle interactions.
What's more, God cares so much about what happens to sub-atomic particles that he actually never, to our knowledge, compromises on that front. God will let anything go down at the human level rather than let one neutron go astray.
What should we infer from this? That the majority of moral value is found at the level of fundamental physics (following Brian Tomasik and then going further). Happily we don't need to worry about this, because God has it under control. We might however wonder what we can infer from this about the moral value of other levels that are less important yet logically intertwined with and thus beyond the reach of God, but might still be more valuable than the one we usually focus on.
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

May 2, 2024 • 7min
AF - Why I am no longer thinking about/working on AI safety by Jack Koch
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Why I am no longer thinking about/working on AI safety, published by Jack Koch on May 2, 2024 on The AI Alignment Forum.
Here's a description of a future which I understand Rationalists and Effective Altruists in general would endorse as an (if not the) ideal outcome of the labors of humanity: no suffering, minimal pain/displeasure, maximal 'happiness' (preferably for an astronomical number of intelligent, sentient minds/beings). (Because we obviously want the best future experiences possible, for ourselves and future beings.)
Here's a thought experiment. If you (anyone - everyone, really) could definitely stop suffering now (if not this second then reasonably soon, say within ~5-10 years) by some means, is there any valid reason for not doing so and continuing to suffer? Is there any reason for continuing to do anything else other than stop suffering (besides providing for food and shelter to that end)?
Now, what if you were to learn there really is a way to accomplish this, with method(s) developed over the course of thousands of human years and lifetimes, the fruits of which have been verified in the experiences of thousands of humans, each of whom attained a total and forevermore cessation of their own suffering?
Knowing this, what possible reason could you give to justify continuing to suffer, for yourself, for your communities, for humanity?
Why/how this preempts the priority of AI work on the present EA agenda
I can only imagine one kind of possible world in which it makes more sense to work on AI safety now and then stop suffering thereafter. The sooner TAI is likely to arrive and the more likely it is that its arrival will be catastrophic without further intervention and (crucially) the more likely it is that the safety problem actually will be solved with further effort, the more reasonable it becomes to make AI safe first and then stop suffering.
To see this, consider a world in which TAI will arrive in 10 years, it will certainly result in human extinction unless and only unless we do X, and it is certainly possible (even easy) to accomplish X in the next 10 years. Presuming living without suffering is clearly preferable to not suffering by not living, it is not prima facie irrational to spend the next 10 years ensuring humanity's continued survival and then stop suffering.
On the other hand, the more likely it is that either 1) we cannot or will not solve the safety problem in time or 2) the safety problem will be solved without further effort/intervention (possibly by never having been much of a problem to begin with), the more it makes sense to prioritize not suffering now, regardless of the outcome.
Now, it's not that I think 2) is particularly likely, so it more or less comes down to how tractable you believe the problem is and how likely your (individual or collective) efforts are to move the needle further in the right direction on safe AI.
These considerations have led me to believe the following:
CLAIM. It is possible, if not likely, that the way to eliminate the most future suffering in expectation is to stop suffering and then help others do the same, directly, now - not by trying to move the needle on beneficial/safe AI.
In summary, given your preference, ceteris paribus, to not suffer, the only valid reason I can imagine for not immediately working directly towards the end of your own suffering and instead focusing on AI safety is a belief that you will gain more (in terms of not suffering) after the arrival of TAI upon which you intervened than you will lose in the meantime by suffering until its arrival, in expectation.
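That condition can be written as a back-of-the-envelope comparison; the numbers below are placeholders for illustration, not estimates from the post:

```python
# T is years until TAI, H is the non-suffering years available afterward if
# things go well, p_with / p_without are survival probabilities with and
# without your marginal safety work. All values are made-up placeholders.

def expected_nonsuffering_years(T, H, p_survive, suffer_until_tai):
    pre_tai = 0.0 if suffer_until_tai else T
    return pre_tai + p_survive * H

T, H = 10, 50
p_with, p_without = 0.55, 0.50  # placeholder marginal effect of your effort

safety_first = expected_nonsuffering_years(T, H, p_with, suffer_until_tai=True)
stop_now = expected_nonsuffering_years(T, H, p_without, suffer_until_tai=False)

# Safety-first only wins when (p_with - p_without) * H exceeds the T years of
# suffering it costs: here 0.05 * 50 = 2.5 < 10, so it does not.
print(safety_first, stop_now)  # 27.5 vs 35.0
```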
This is even presuming a strict either/or choice for the purpose of illustration; why couldn't you work on not suffering while continuing to work towards safe AI as your "day job"? Personally, the years I spent working on AI...

May 1, 2024 • 16min
LW - The Intentional Stance, LLMs Edition by Eleni Angelou
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Intentional Stance, LLMs Edition, published by Eleni Angelou on May 1, 2024 on LessWrong.
In memoriam of Daniel C. Dennett.
tl;dr: I sketch out what it means to apply Dennett's Intentional Stance to LLMs. I argue that intentional vocabulary is already ubiquitous in experimentation with these systems; what is missing, therefore, is the theoretical framework to justify this usage. I aim to supply that framework and explain why the intentional stance is the best available explanatory tool for LLM behavior.
Choosing Between Stances
Why choose the intentional stance?
It seems natural to ascribe cognitive states to AI models, starting from the field's terminology, most prominently by calling it "machine learning" (Hagendorff 2023). This is very much unlike how other computer programs are treated.
When programmers write software, they typically understand it in terms of what they designed it to execute (design stance) or simply make sense of it in terms of its physical properties, such as the materials it was made of or the various electrical signals processed in its circuitry (physical stance). As I note, it is not that we cannot use Dennett's other two stances (Dennett 1989) to talk about these systems.
It is rather that neither of them constitutes the best explanatory framework for interacting with LLMs.
To illustrate this, consider the reverse example. It is possible to apply the intentional stance to a hammer, although this does not generate any new information or optimally explain the behavior of the tool. What seems apt for making sense of how hammers operate is instead the design stance. This is just as applicable to other tool-like computer programs. To use a typical program, there is no need to posit intentional states.
Unlike LLMs, users do not engage in human-like conversation with the software.
More precisely, the reason why neither the design nor the physical stance is sufficient to explain and predict the behavior of LLMs is that state-of-the-art LLM outputs are in practice indistinguishable from those of human agents (Y. Zhou et al. 2022). It is possible to think about LLMs as trained systems or as consisting of graphics cards and neural network layers, but these framings hardly make any difference when one attempts to prompt them and make them helpful for conversation and problem-solving.
What is more, machine learning systems like LLMs are not programmed to execute a task but are rather trained to find the policy that will execute the task. In other words, developers are not directly coding the information required to solve the problem they are using the AI for: they train the system to find the solution on its own. This requires the model to possess all the necessary concepts.
In that sense, dealing with LLMs is more akin to studying a biological organism that is under development or perhaps raising a child, and less like building a tool the use of which is well-understood prior to the system's interaction with its environment. The LLM can learn from feedback and "change its mind" about the optimal policy to go about its task which is not the case for the standard piece of software. Moreover, LLMs seem to possess concepts.
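A toy illustration of the programmed-versus-trained distinction (the spam-filter example is mine, not the post's): the rule-based function is fully specified by its author, while the trained model finds its own decision policy from data.

```python
from sklearn.linear_model import LogisticRegression

def rule_based_spam_filter(num_links: int) -> bool:
    # Design stance: the author wrote the policy down explicitly.
    return num_links > 3

X = [[0], [1], [5], [8]]  # feature: number of links in a message
y = [0, 0, 1, 1]          # label: spam or not
learned_filter = LogisticRegression().fit(X, y)

# The learned policy was never written by the developer; it was found by
# training, which is the sense in which LLMs are "trained to find the policy"
# rather than programmed to execute one.
print(rule_based_spam_filter(5), learned_filter.predict([[5]])[0])
```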
Consequently, there is a distinction to be drawn between tool-like and agent-like programs. Judging on a behavioral basis, LLMs fall into the second category. This conclusion renders the intentional stance (Dennett 1989) practically indispensable for the evaluation of LLMs on a behavioral basis.
Folk Psychology for LLMs
What kind of folk psychology should we apply to LLMs? Do they have beliefs, desires, and goals?
LLMs acquire "beliefs" from their training distribution, since they do not memorize or copy any text from it when outputting their results - at least no more than human writers and speakers do. They must, as a result, ...


