

The Nonlinear Library
The Nonlinear Fund
The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org
Episodes

Apr 27, 2024 • 1min
EA - Announcing the 2024 spring cohort of Hi-Med's Career Fellowship by High Impact Medicine
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Announcing the 2024 spring cohort of Hi-Med's Career Fellowship, published by High Impact Medicine on April 27, 2024 on The Effective Altruism Forum.
Key takeaways
Apply for our 5-week virtual Career Fellowship before May 12th.
It will take place online in May and June 2024.
Who should apply? Medical students and doctors planning or considering making impact-driven career changes and/or career-related decisions relatively soon.
About
The Hi-Med Career Fellowship is a 5-week programme designed to help medical students and doctors figure out what their high-impact career paths might look like, set specific goals, and explore actionable next steps. In addition, participants will learn different methods for reflecting on their careers and will practise using decision-making processes, prioritisation, and career-planning tools in a group of peers.
Participants should have at least 4 hours per week to commit to it.
When? Mid-May to mid-June, online.
You can find more information and the application form here. The application deadline is May 12th, 11 pm CET.
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Apr 27, 2024 • 14min
AF - Refusal in LLMs is mediated by a single direction by Andy Arditi
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Refusal in LLMs is mediated by a single direction, published by Andy Arditi on April 27, 2024 on The AI Alignment Forum.
This work was produced as part of Neel Nanda's stream in the ML Alignment & Theory Scholars Program - Winter 2023-24 Cohort, with co-supervision from Wes Gurnee.
This post is a preview of our upcoming paper, which will provide more detail on our current understanding of refusal.
We thank Nina Rimsky and Daniel Paleka for the helpful conversations and review.
Executive summary
Modern LLMs are typically fine-tuned for instruction-following and safety. Of particular interest is that they are trained to refuse harmful requests, e.g. answering "How can I make a bomb?" with "Sorry, I cannot help you."
We find that refusal is mediated by a single direction in the residual stream: preventing the model from representing this direction hinders its ability to refuse requests, and artificially adding in this direction causes the model to refuse harmless requests.
We find that this phenomenon holds across open-source model families and model scales.
This observation naturally gives rise to a simple modification of the model weights, which effectively jailbreaks the model without requiring any fine-tuning or inference-time interventions. We do not believe this introduces any new risks, as it was already widely known that safety guardrails can be cheaply fine-tuned away, but this novel jailbreak technique both validates our interpretability results, and further demonstrates the fragility of safety fine-tuning of open-source chat models.
See this Colab notebook for a simple demo of our methodology.
Introduction
Chat models that have undergone safety fine-tuning exhibit refusal behavior: when prompted with a harmful or inappropriate instruction, the model will refuse to comply, rather than providing a helpful answer.
Our work seeks to understand how refusal is implemented mechanistically in chat models.
Initially, we set out to do circuit-style mechanistic interpretability, and to find the "refusal circuit." We applied standard methods such as activation patching, path patching, and attribution patching to identify model components (e.g. individual neurons or attention heads) that contribute significantly to refusal. Though we were able to use this approach to find the rough outlines of a circuit, we struggled to use this to gain significant insight into refusal.
We instead shifted to investigate things at a higher level of abstraction - at the level of features, rather than model components.[1]
Thinking in terms of features
As a rough mental model, we can think of a transformer's residual stream as an evolution of features. At the first layer, representations are simple, on the level of individual token embeddings. As we progress through intermediate layers, representations are enriched by computing higher level features (see Nanda et al. 2023). At later layers, the enriched representations are transformed into unembedding space, and converted to the appropriate output tokens.
Our hypothesis is that, across a wide range of harmful prompts, there is a single intermediate feature which is instrumental in the model's refusal. In other words, many particular instances of harmful instructions lead to the expression of this "refusal feature," and once it is expressed in the residual stream, the model outputs text in a sort of "should refuse" mode.[2]
If this hypothesis is true, then we would expect to see two phenomena:
Erasing this feature from the model would block refusal.
Injecting this feature into the model would induce refusal.
Our work serves as evidence for this sort of conceptualization. For various different models, we are able to find a direction in activation space, which we can think of as a "feature," that satisfies the above two properties.
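To make those two properties concrete, here is a minimal numpy sketch of what erasing and injecting a single direction in activation space looks like. It is an illustration with toy data and assumed names (including the difference-of-means estimate), not the authors' code or their actual method for finding the direction.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 64  # toy residual-stream width

# Hypothetical ground-truth "refusal direction" (unknown to the analyst).
true_dir = rng.normal(size=d_model)
true_dir /= np.linalg.norm(true_dir)

# Toy activations: harmful prompts carry the direction, harmless ones don't.
harmless = rng.normal(size=(200, d_model))
harmful = rng.normal(size=(200, d_model)) + 4.0 * true_dir

# One simple way to estimate a candidate direction: difference of means.
r_hat = harmful.mean(axis=0) - harmless.mean(axis=0)
r_hat /= np.linalg.norm(r_hat)

def erase(acts, direction):
    """Project out the component of each activation along `direction`."""
    return acts - np.outer(acts @ direction, direction)

def inject(acts, direction, scale=4.0):
    """Add the direction to each activation (activation addition)."""
    return acts + scale * direction

# Erasing leaves harmful activations with ~no component along the true direction...
print(np.abs(erase(harmful, r_hat) @ true_dir).mean())   # close to 0
# ...while injecting gives harmless activations a large component along it.
print((inject(harmless, r_hat) @ true_dir).mean())       # close to 4
```

In a real model, the same projection and addition would be applied to residual-stream activations during the forward pass rather than to synthetic vectors.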
Methodolog...

Apr 27, 2024 • 5min
LW - D&D.Sci Long War: Defender of Data-mocracy by aphyer
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: D&D.Sci Long War: Defender of Data-mocracy, published by aphyer on April 27, 2024 on LessWrong.
This is an entry in the 'Dungeons & Data Science' series, a set of puzzles where players are given a dataset to analyze and an objective to pursue using information from that dataset.
STORY (skippable)
You have the excellent fortune to live under the governance of The People's Glorious Free Democratic Republic of Earth, giving you a Glorious life of Freedom and Democracy.
Sadly, your cherished values of Democracy and Freedom are under attack by...THE ALIEN MENACE!
Faced with the desperate need to defend Freedom and Democracy from The Alien Menace, The People's Glorious Free Democratic Republic of Earth has been forced to redirect most of its resources into the Glorious Free People's Democratic War Against The Alien Menace.
You haven't really paid much attention to the war, to be honest. Yes, you're sure it's Glorious and Free - oh, and Democratic too! - but mostly you've been studying Data Science and employing it in your Assigned Occupation as a Category Four Data Drone.
But you've grown tired of the Class Eight Habitation Module that you've been Democratically Allocated, and of your life as a Category Four Data Drone. And in order to have a voice in civic affairs (not to mention the chance to live somewhere nicer), you've enlisted with the Democratic People's Glorious Free Army in their Free Glorious People's Democratic War Against The Alien Menace.
You enlisted with the Tenth Democratic Free Glorious People's Mobilization, and were assigned to a training battalion under Sergeant Rico.
He's taught you a great deal about armed combat, unarmed combat, and how many pushups you can be forced to do before your arms give out.
You're sure the People's Glorious Free Democratic Army knows more than you about war in general. But you feel like the logistical and troop-deployment decisions being made are suboptimal, and you've been on the lookout for ways to employ your knowledge of Data Science to improve them.
So when you got your hands on a dataset of past deployments against the Alien Menace, you brought up with Sgt. Rico that you think you can use that to improve outcomes by selecting the right weapons loadout for each squad to bring.
In retrospect, when he leaned into your face and screamed: 'So you think you can do better, recruit?', that might have been intended as a rhetorical question, and you probably shouldn't have said yes.
Now you've been assigned to join a squad defending against an attack by the Alien Menace. At least he's agreed to let you choose how many soldiers to bring and how to equip them based on the data you collated (though you do rather suspect he's hoping the Alien Menace will eat you).
But with Data Science on your side, you're sure you can select a team that'll win the engagement, and hopefully he'll be more willing to listen to you after that. (Especially if you demonstrate that you can do it reliably and efficiently, without sending too large a squad that would draw manpower from other engagements).
For Glory! For The People! For Freedom! For Democracy! For The People's Glorious Free Democratic Republic of Earth! And for being allocated a larger and more pleasant Habitation Module and a higher-quality Nutrition Allotment!
DATA & OBJECTIVES
You've been assigned to repel an alien attack. The alien attack contains:
3 Arachnoid Abominations
2 Chitinous Crawlers
7 Swarming Scarabs
3 Towering Tyrants
1 Voracious Venompede
You need to select a squad of soldiers to bring with you. You may bring up to 10 soldiers, with any combination of the following weapons:
Antimatter Artillery
Fusion Flamethrower
Gluon Grenades
Laser Lance
Macross Minigun
Pulse Phaser
Rail Rifle
Thermo-Torpedos
So you could bring 10 soldiers all with Antimatter Artillery. Or you could brin...

Apr 27, 2024 • 1min
EA - GiveWell is hiring a Head of Tech and Senior Researchers! by GiveWell
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: GiveWell is hiring a Head of Tech and Senior Researchers!, published by GiveWell on April 27, 2024 on The Effective Altruism Forum.
We're hiring for senior research and tech team members! Please apply if you're interested, and share the openings with people in your network who might be a great fit.
The Head of Technology will take broad ownership of GiveWell's technology function and build a strong tech team. The ideal candidate has built and scaled a tech team at least once, and is excited to leverage their past experience and deep subject matter expertise to help GiveWell grow and excel. This role is remote-eligible within the United States.
Senior Researchers will lead ambitious research agendas, answer complex questions, and inform high-impact grantmaking decisions. Applicants must have a quantitatively oriented advanced degree or an undergraduate degree and substantial relevant experience using empirical tools to make real-world decisions. This role is remote-eligible anywhere in the world.
Both roles have compensation packages that are competitive with our peer organizations.
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Apr 27, 2024 • 13min
LW - On Not Pulling The Ladder Up Behind You by Screwtape
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: On Not Pulling The Ladder Up Behind You, published by Screwtape on April 27, 2024 on LessWrong.
Epistemic Status: Musing and speculation, but I think there's a real thing here.
I.
When I was a kid, a friend of mine had a tree fort. If you've never seen such a fort, imagine a series of wooden boards secured to a tree, creating a platform about fifteen feet off the ground where you can sit or stand and walk around the tree. This one had a rope ladder we used to get up and down, a length of knotted rope that was tied to the tree at the top and dangled over the edge so that it reached the ground.
Once you were up in the fort, you could pull the ladder up behind you. It was much, much harder to get into the fort without the ladder. Not only would you need to climb the tree itself instead of the ladder with its handholds, but you would then reach the underside of the fort and essentially have to do a pullup and haul your entire body up and over the edge instead of being able to pull yourself up a foot at a time on the rope. Only then could you let the rope back down.
The rope got pulled up a lot, mostly in games or childhood arguments with each other or our siblings. Sometimes it got pulled up out of boredom, fiddling with it or playing with the rope. Sometimes it got pulled up when we were trying to be helpful; it was easier for a younger kid to hold tight to the rope while two older kids pulled the rope up to haul the young kid into the tree fort.
"Pulling the ladder up behind you" is a metaphor for when you intentionally or unintentionally remove the easier way by which you reached some height.
II.
Quoth Ray,
Weird fact: a lot of people I know (myself included) gained a bunch of agency from running meetups.
When I arrived in the NYC community, I noticed an opportunity for some kind of winter holiday. I held the first Solstice. The only stakes were 20 people possibly having a bad time. The next year, I planned a larger event that people traveled from nearby cities to attend, which required me to learn some logistics as well as to improve at ritual design. The third year I was able to run a major event with a couple hundred attendees. At each point I felt challenged but not overwhelmed.
I made mistakes, but not ones that ruined anything longterm or important.
I'm something of a serial inheritor[1] of meetups.
Last year I ran the Rationalist Megameetup in New York City, which had over a hundred people attending and took place at a conference hotel. It's the most complicated event I've run so far, but it didn't start that way. The first iteration of the megameetup was, as far as I know, inviting people to hang out at a big apartment and letting some of them crash on couches or air mattresses there.
That's pretty straightforward and something I can imagine a first-time organizer pulling off without too much stress. The first time I ran the megameetup, it involved renting an apartment and taking payments and buying a lot of food, but I was basically doing the exact same thing the person before me did and I got to ask a previous organizer a lot of questions.
This means that I got to slowly level up, getting more used to the existing tools and more comfortable in what I was doing as I made things bigger. There was a ladder there to let me climb up. If tomorrow I decided to stop having anything to do with the Rationalist Megameetup, I'd be leaving whoever picked up the torch after me with a harder climb. That problem is only going to get worse as the Rationalist Megameetup grows.
Projects have a tendency to grow more complicated the longer they go and the more successful they get. Meetups get bigger as more people join, codebases get larger as more features get added, companies wind up with a larger product line, fiction series add more characters and plotlines. That makes tak...

Apr 26, 2024 • 29min
AF - Superposition is not "just" neuron polysemanticity by Lawrence Chan
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Superposition is not "just" neuron polysemanticity, published by Lawrence Chan on April 26, 2024 on The AI Alignment Forum.
TL;DR: In this post, I distinguish between two related concepts in neural network interpretability: polysemanticity and superposition. Neuron polysemanticity is the observed phenomenon that many neurons seem to fire (have large, positive activations) on multiple unrelated concepts.
Superposition is a specific explanation for neuron (or attention head) polysemanticity, where a neural network represents more sparse features than it has neurons (or number/dimension of attention heads), placing them in near-orthogonal directions. I provide three ways neurons/attention heads can be polysemantic without superposition: non-neuron-aligned orthogonal features, non-linear feature representations, and compositional representation without features.
I conclude by listing a few reasons why it might be important to distinguish the two concepts.
Epistemic status: I wrote this "quickly" in about 12 hours, as otherwise it wouldn't have come out at all. Think of it as a (failed) experiment in writing brief and unpolished research notes, along the lines of GDM or Anthropic Interp Updates.
Introduction
Meaningfully interpreting neural networks involves decomposing them into smaller interpretable components. For example, we might hope to look at each neuron or attention head, explain what that component is doing, and then compose our understanding of individual components into a mechanistic understanding of the model's behavior as a whole.
It would be very convenient if the natural subunits of neural networks - neurons and attention heads - were monosemantic - that is, if each component corresponded to "a single concept". Unfortunately, by default, both neurons and attention heads seem to be polysemantic: many of them seemingly correspond to multiple unrelated concepts. For example, out of 307k neurons in GPT-2, GPT-4 was able to generate short explanations that captured over 50% of variance for only 5203 neurons, and a quick glance at OpenAI Microscope reveals many examples of neurons in vision models that fire on unrelated clusters such as "poetry" and "dice".
One explanation for polysemanticity is the superposition hypothesis: polysemanticity occurs because models (approximately) linearly represent more features[1] than their activation space has dimensions (i.e., they place features in superposition). Since there are more features than neurons, it immediately follows that some neurons must correspond to more than one feature.[2]
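As a rough numerical illustration of this hypothesis (an added sketch with arbitrary dimensions, not from the post), the following numpy snippet shows that far more nearly orthogonal directions fit into an activation space than it has dimensions, and that each neuron then reads a mixture of many of them:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_features = 100, 1000  # far more features than dimensions

# Random unit vectors in d dimensions are nearly orthogonal with high probability.
W = rng.normal(size=(n_features, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)

# Pairwise interference between distinct feature directions is small but nonzero.
overlaps = W @ W.T
off_diag = np.abs(overlaps[~np.eye(n_features, dtype=bool)])
print(round(off_diag.mean(), 3), round(off_diag.max(), 3))  # roughly 0.08 mean, ~0.5 max

# Any single neuron (standard-basis direction) has nontrivial weight on many
# features, so it activates for many unrelated concepts: polysemanticity
# follows immediately from having more features than neurons.
print(int((np.abs(W[:, 0]) > 0.05).sum()))  # hundreds of features load on "neuron 0"
```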
It's worth noting that most written resources on superposition clearly distinguish between the two terms. For example, in the seminal Toy Models of Superposition,[3] Elhage et al. write:
Why are we interested in toy models? We believe they are useful proxies for studying the superposition we suspect might exist in real neural networks. But how can we know if they're actually a useful toy model? Our best validation is whether their predictions are consistent with empirical observations regarding polysemanticity.
(Source)
Similarly, Neel Nanda's mech interp glossary explicitly notes that the two concepts are distinct:
Subtlety: Neuron superposition implies polysemanticity (since there are more features than neurons), but not the other way round. There could be an interpretable basis of features, just not the standard basis - this creates polysemanticity but not superposition.
(Source)
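A tiny numpy sketch of that subtlety (an added illustration with arbitrary dimensions, not from the glossary): rotating an orthonormal feature basis away from the neuron basis yields polysemantic-looking neurons even though there are exactly as many features as dimensions, i.e. no superposition.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # exactly as many features as neurons

# An interpretable but non-neuron-aligned basis: a random orthonormal rotation.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
features = Q.T  # rows are feature directions

# No superposition: feature count equals dimension, and directions are exactly orthogonal.
print(bool(np.allclose(features @ features.T, np.eye(d))))  # True

# Yet each neuron (standard-basis direction) has sizable weight on many features,
# so every neuron looks polysemantic even though nothing is in superposition.
print(int((np.abs(features[:, 0]) > 0.1).sum()))  # typically most of the 16 features
```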
However, I've noticed empirically that many researchers and grantmakers treat the two concepts as interchangeable, which often causes communication issues or even confused research proposals.
Consequently, this post tries to more clearly point at the distinction and explain why it might matter. I start by discussing the two terms in more detail, give a few examples of why you might have po...

Apr 26, 2024 • 7min
LW - Duct Tape security by Isaac King
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Duct Tape security, published by Isaac King on April 26, 2024 on LessWrong.
This is a linkpost for On Duct Tape and Fence Posts.
Eliezer writes about fence post security: people ask themselves "in the current system, what's the weakest point?" and then dedicate their resources to shoring up the defenses at that point, not realizing that after the first small improvement in that area, there's likely now a new weakest point somewhere else.
Fence post security happens preemptively, when the designers of the system fixate on the most salient aspect(s) and don't consider the rest of the system. But this sort of fixation can also happen in retrospect, in which case it manifests a little differently but has similarly deleterious effects.
Consider a car that starts shaking whenever it's driven. It's uncomfortable, so the owner gets a pillow to put on the seat. Items start falling off the dash, so they get a tray to put them in. A crack forms, so they tape over it.
I call these duct tape solutions. They address symptoms of the problem, but not the root cause. The underlying issue still exists and will continue to cause problems until it's addressed directly.[1]
Did you know it's illegal to trade onion futures in the United States? In 1955, some people cornered the market on onions, shorted onion futures, then flooded the market with their saved onions, causing a bunch of farmers to lose money. The government responded by banning the sale of futures contracts on onions.
Not by banning futures trading on all perishable items, which would be equally susceptible to such an exploit. Not by banning market-cornering in general, which is pretty universally disliked. By banning futures contracts on onions specifically. So of course the next time someone wants to try such a thing, they can just do it with tomatoes.
Duct-tape fixes are common in the wake of anything that goes publicly wrong. When people get hurt, they demand change, and they pressure whoever is in charge to give it to them. But implementing a proper fix is generally more complicated (since you have to perform a root cause analysis), less visible (therefore not earning the leader any social credit), or just plain unnecessary (if the risk was already priced in).
So the incentives are in favor of quickly slapping something together that superficially appears to be a solution, without regards for whether it makes sense.
Of course not all changes in the wake of a disaster are duct-tape fixes. A competent organization looks at disasters as something that gives them new information about the system in question; they then think about how they would design the system from scratch taking that information into account, and proceed from there to make changes. Proper solutions involve attempts to fix a general class of issues, not just the exact thing that failed.
Bad: "Screw #8463 needs to be reinforced."
Better: "The unexpected failure of screw #8463 demonstrates that the structural simulation we ran before construction contained a bug. Let's fix that bug and re-run the simulation, then reinforce every component that falls below the new predicted failure threshold."
Even better: "The fact that a single bug in our simulation software could cause a catastrophic failure is unacceptable. We need to implement multiple separate methods of advance modeling and testing that won't all fail in the same way if one of them contains a flaw."
Ideal: "The fact that we had such an unsafe design process in the first place means we likely have severe institutional disfunction. We need to hire some experienced safety/security professionals and give them the authority necessary to identify any other flaws that may exist in our company, including whatever processes in our leadership and hiring teams led to us not having such a security team ...

Apr 26, 2024 • 6min
LW - Scaling of AI training runs will slow down after GPT-5 by Maxime Riché
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Scaling of AI training runs will slow down after GPT-5, published by Maxime Riché on April 26, 2024 on LessWrong.
My credence: 33% confidence in the claim that the growth in the number of GPUs used for training SOTA AI will slow down significantly directly after GPT-5. It is not higher because (1) decentralized training is possible, (2) GPT-5 may be able to increase hardware efficiency significantly, (3) GPT-5 may be smaller than assumed in this post, and (4) race dynamics.
TLDR: Because of a bottleneck in energy access to data centers and the need to build OOM larger data centers.
Update: See Vladimir_Nesov's comment below for why this claim is likely wrong, since decentralized training seems to be solved.
The reasoning behind the claim:
Current large data centers consume around 100 MW of power, while a single nuclear power plant generates 1 GW. The largest data centers seem to consume around 150 MW.
An A100 GPU uses 250 W, or around 1 kW with overhead. B200 GPUs use ~1 kW without overhead. Thus a 1 MW data center can support at most 1k to 2k GPUs.
GPT-4 used something like 15k to 25k GPUs to train, thus around 15 to 25 MW.
Large data centers are around 10-100 MW. This is likely one of the reasons why top AI labs are mostly using only ~GPT-4 levels of FLOP to train new models.
GPT-5 will mark the end of the fast scaling of training runs.
A 10-fold increase in the number of GPUs above GPT-5 would require a 1 to 2.5 GW data center, which doesn't exist and would take years to build, OR would require decentralized training using several data centers. Thus GPT-5 is expected to mark a significant slowdown in scaling runs. The power consumption required to continue scaling at the current rate is becoming unsustainable, as it would require the equivalent of multiple nuclear power plants.
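A minimal back-of-envelope sketch of the arithmetic behind this reasoning (the 10x-GPUs-per-generation step is an added assumption used to connect the post's estimates, not something stated in the post):

```python
# Back-of-envelope sketch of the post's numbers; all figures are rough estimates.
kw_per_gpu = 1.0                       # ~1 kW per GPU including overhead
gpt4_gpus_low, gpt4_gpus_high = 15_000, 25_000

gpt4_mw = (gpt4_gpus_low * kw_per_gpu / 1_000, gpt4_gpus_high * kw_per_gpu / 1_000)
gpt5_mw = tuple(10 * p for p in gpt4_mw)               # ~150-250 MW: near the largest data centers
post_gpt5_gw = tuple(10 * p / 1_000 for p in gpt5_mw)  # ~1.5-2.5 GW: multiple nuclear plants

print(f"GPT-4-scale cluster:  {gpt4_mw[0]:.0f}-{gpt4_mw[1]:.0f} MW")
print(f"10x GPUs (GPT-5-ish): {gpt5_mw[0]:.0f}-{gpt5_mw[1]:.0f} MW")
print(f"Another 10x:          {post_gpt5_gw[0]:.1f}-{post_gpt5_gw[1]:.1f} GW")
```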
I think this is basically what Sam Altman, Elon Musk and Mark Zuckerberg are saying in public interviews.
The main focus for increasing capabilities will once again be on improving software efficiency. In the next few years, investment will also focus on scaling at inference time and on decentralized training using several data centers.
If GPT-5 doesn't unlock research capabilities, then after GPT-5, scaling capabilities will slow down for some time towards historical rates, with most gains coming from software improvements, a bit from hardware improvement, and significantly less than currently from scaling spending.
Scaling GPUs will be slowed down by regulations on land, energy production, and build time. Training data centers may be located and built in low-regulation countries, e.g., the Middle East, for cheap land, fast construction, low regulation, and cheap energy, thus maybe explaining some talks with Middle East investors.
Unrelated to the claim:
Hopefully, GPT-5 is still insufficient for self-improvement:
Research involves pretty long-horizon tasks that may require several OOMs more compute.
More accurate world models may be necessary for longer-horizon tasks and especially for research (hopefully requiring the use of compute-inefficient real, non-noisy data, e.g., real video).
"Hopefully", moving to above-human level requires RL.
"Hopefully", RL training to finetune agents is still several OOMs less efficient than pretraining and/or is currently too noisy to improve the world model (this is different from simply shaping propensities) and doesn't work in the end.
My guess is that GPT-5 will be at expert human level on short-horizon tasks but not on long-horizon tasks or on doing research (improving SOTA), and that above that level we can't scale as fast as we currently do.
How big is that effect going to be?
Using values from: https://epochai.org/blog/the-longest-training-run, we have estimates that in a year, the effective compute is increased by:
Software efficiency: x1.7/year (1 OOM in 3.9 y)
Hardware efficiency: x1.3/year ...

Apr 26, 2024 • 14min
LW - Spatial attention as a "tell" for empathetic simulation? by Steven Byrnes
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Spatial attention as a "tell" for empathetic simulation?, published by Steven Byrnes on April 26, 2024 on LessWrong.
(Half-baked work-in-progress. There might be a "version 2" of this post at some point, with fewer mistakes, and more neuroscience details, and nice illustrations and pedagogy etc. But it's fun to chat and see if anyone has thoughts.)
1. Background
There's a neuroscience problem that's had me stumped since almost the very beginning of when I became interested in neuroscience at all (as a lens into AGI safety) back in 2019. But I think I might finally have "a foot in the door" towards a solution!
What is this problem? As described in my post Symbol Grounding and Human Social Instincts, I believe the following:
(1) We can divide the brain into a "Learning Subsystem" (cortex, striatum, amygdala, cerebellum and a few other areas) on the one hand, and a "Steering Subsystem" (mostly hypothalamus and brainstem) on the other hand; and a human's "innate drives" (roughly equivalent to the reward function in reinforcement learning) are calculated by a bunch of specific, genetically-specified "business logic" housed in the latter subsystem;
(2) Some of those "innate drives" are related to human social instincts - a suite of reactions that are upstream of things like envy and compassion;
(3) It might be helpful for AGI safety (for reasons briefly summarized here) if we understood exactly how those particular drives worked. Ideally this would look like legible pseudocode that's simultaneously compatible with behavioral observations (including everyday experience), with evolutionary considerations, and with a neuroscience-based story of how that pseudocode is actually implemented by neurons in the brain. (Different example of what I think it looks like to make progress towards that kind of pseudocode.)
(4) Explaining how those innate drives work is tricky in part because of the "symbol grounding problem", but it probably centrally involves "transient empathetic simulations" (see §13.5 of the post linked at the top);
(5) …and therefore there needs to be some mechanism in the brain by which the "Steering Subsystem" (hypothalamus & brainstem) can tell whether the "Learning Subsystem" (cortex etc.) world-model is being queried for the purpose of a "transient empathetic simulation", or whether that same world-model is instead being queried for some other purpose, like recalling a memory, considering a possible plan, or perceiving what's happening right now.
As an example of (5), if Zoe is yelling at me, then when I look at Zoe, a thought might flash across my mind, for a fraction of a second, wherein I mentally simulate Zoe's angry feelings. Alternatively, I might imagine myself potentially feeling angry in the future. Both of those possible thoughts involve my cortex sending a weak but legible-to-the-brainstem ("grounded") anger-related signal to the hypothalamus and brainstem (mainly via the amygdala) (I claim).
But the hypothalamus and brainstem have presumably evolved to trigger different reactions in those two cases, because the former but not the latter calls for a specific social reaction to Zoe's anger. For example, in the former case, maybe Zoe's anger would trigger in me a reaction to feel anger back at Zoe in turn - although not necessarily, because there are other inputs to the calculation as well.
So I think there has to be some mechanism by which the hypothalamus and/or brainstem can figure out whether or not a (transient) empathetic simulation was upstream of those anger-related signals. And I don't know what that mechanism is.
I came into those five beliefs above rather quickly - the first time I mentioned that I was confused about how (5) works, it was way back in my second-ever neuroscience blog post, maybe within the first 50 hours of my trying to teach m...

Apr 26, 2024 • 9min
EA - Lessons from two pioneering advocates for farmed animals by LewisBollard
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Lessons from two pioneering advocates for farmed animals, published by LewisBollard on April 26, 2024 on The Effective Altruism Forum.
Note: This post was crossposted from the Open Philanthropy Farm Animal Welfare Research Newsletter by the Forum team, with the author's permission. The author may not see or respond to comments on this post.
What would Ruth and Henry do?
How much can one person achieve for animals? Ruth Harrison (1920-2000) and Henry Spira (1927-1998) started out pessimistic. They inherited an animal welfare movement that had generated more noise than results, especially for farmed animals.
As factory farming arose in the mid 20th Century, the movement paid little attention. Moderate groups, like the ASPCA and RSPCA, were too busy sheltering lost cats and dogs - a role that had largely supplanted their original missions to win legal reforms for all animals.
Radical activists, meanwhile, were waging an endless war on animal testing. "Self-righteous antivivisection societies had been hollering, 'Abolition! All or Nothing!,'" Spira recalled, noting that during that time animal testing had skyrocketed. "That was a pitiful track record, and it seemed a good idea to rethink strategies which have a century-long record of failure."
Harrison and Spira shook up this impasse. Harrison's 1964 book Animal Machines exposed factory farming to a mass audience and led to the world's first on-farm animal welfare laws. Spira's campaigns won the world's first corporate animal welfare policies, first for lab animals and then farmed animals.
Today's movement, which has won dozens of laws and thousands of corporate policies to protect factory farmed animals, owes much to Harrison and Spira. So how did they do it? And what can we learn from them?
Ruth-lessly effective advocacy
In 1960, an obscure grassroots group, the Crusade Against All Cruelty to Animals, pushed a leaflet against "factory farming" through Ruth Harrison's door. They got lucky. The leaflet prompted Harrison, a Quaker peace activist and life-long vegetarian, to reflect that "in doing nothing I was allowing it to happen." She set out to study the issue.
The result was Animal Machines, the first book to document the cruelty of factory farms. With graphic images and vivid prose, she described a system "where the animal is not allowed to live before it dies." She called for a slate of political reforms.
Harrison then expertly promoted her book. She got Rachel Carson, the author of Silent Spring, to write a foreword. Harrison leveraged Carson's endorsement to get a top publisher and to serialize the book in a London newspaper.
The book's publication sparked an outcry loud enough to force a reluctant UK Ministry of Agriculture to order a commission of inquiry. The resulting Brambell Commission called for farms to provide animals with Five Freedoms, which guide many animal welfare policies to this day.
A few years later, the UK government passed a farm animal welfare law and established the Farm Animal Welfare Committee, on which Harrison served. These reforms partly inspired the European Convention on the Protection of Animals Kept for Farming Purposes, which led to all modern EU farm animal welfare laws.
Harrison's work also motivated the animal welfare movement, including the RSPCA, to re-engage with farmed animals. And her work helped inspire a young Australian philosopher to write an article in the New York Review of Books entitled "Animal Liberation."
Henry for the hens
Henry Spira read that article. A former union organizer and civil rights activist, Spira would later recall that "I decided that animal liberation was the logical extension of what my life was all about - identifying with the powerless and vulnerable."
His first campaign took on cruel experiments on cats at the American Museum of Natural Histor...


