

The Nonlinear Library
The Nonlinear Fund
The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org
Episodes
Mentioned books

Apr 19, 2024 • 1h 19min
LW - [Full Post] Progress Update #1 from the GDM Mech Interp Team by Neel Nanda
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: [Full Post] Progress Update #1 from the GDM Mech Interp Team, published by Neel Nanda on April 19, 2024 on LessWrong.
This is a series of snippets about the Google DeepMind mechanistic interpretability team's research into Sparse Autoencoders that didn't meet our bar for a full paper. Please start at the summary post for more context and a summary of each snippet. They can be read in any order.
Activation Steering with SAEs
Arthur Conmy, Neel Nanda
TL;DR: We use SAEs trained on GPT-2 XL's residual stream to decompose steering vectors into interpretable features. We find a single SAE feature for anger which is a Pareto-improvement over the anger steering vector from existing work (Section 3, 3 minute read). We have more mixed results with wedding steering vectors: we can partially interpret the vectors, but the SAE reconstruction is a slightly worse steering vector, and just taking the obvious features produces a notably worse vector. We can produce a better steering vector by removing SAE features which are irrelevant (Section 4). This is one of the first examples of SAEs having any success for enabling better control of language models, and we are excited to continue exploring this in future work.
1. Background and Motivation
We are uncertain about how useful mechanistic interpretability research, including SAE research, will be for AI safety and alignment. Unlike RLHF and dangerous capability evaluation (for example), mechanistic interpretability is not currently very useful for downstream applications on models. Though there are ambitious goals for mechanistic interpretability research, such as finding safety-relevant features in language models using SAEs, these are likely not tractable on the relatively small base models we study in all our snippets.
To address these two concerns, we decided to study activation steering[1] (introduced in this blog post and expanded on in a paper). We recommend skimming the blog post for an explanation of the technique and examples of what it can do. Briefly, activation steering takes vector(s) from the residual stream on some prompt(s), and then adds these to the residual stream on a second prompt. This makes outputs from the second forward pass have properties inherited from the first forward pass. There is early evidence that this technique could help with safety-relevant properties of LLMs, such as sycophancy.
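The mechanics of the technique can be sketched in a few lines. This is a toy stand-in for a transformer's residual stream, not the authors' code: the layer structure, the layer index, and the coefficient of 4.0 are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_layers, d_model = 3, 8

# Toy residual stream: each "layer" adds a transformed copy of its input,
# mimicking resid_post = resid_pre + layer_out in a transformer.
W = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_layers)]

def forward(x, steer_layer=None, steer_vec=None, coeff=1.0, return_resids=False):
    resids = []
    for i, Wi in enumerate(W):
        if i == steer_layer:
            x = x + coeff * steer_vec      # the steering intervention
        resids.append(x.copy())
        x = x + np.tanh(x @ Wi)            # residual update
    return (x, resids) if return_resids else x

source = rng.standard_normal(d_model)  # prompt carrying the property (e.g. "Anger")
target = rng.standard_normal(d_model)  # prompt whose completion we steer

# 1) Record the residual stream on the source prompt at some layer.
_, resids = forward(source, return_resids=True)
steer_vec = resids[1]

# 2) Add it (scaled) into the target prompt's residual stream at the same layer.
steered = forward(target, steer_layer=1, steer_vec=steer_vec, coeff=4.0)
```

In practice this is done with forward hooks on a real model (e.g. via TransformerLens); the point here is only the record-then-add arithmetic.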
We have tentative early research results that suggest SAEs are helpful for improving and interpreting steering vectors, albeit with limitations. We find these results particularly exciting as they provide evidence that SAEs can identify causally meaningful intermediate variables in the model, indicating that they aren't just finding clusters in the data or directions in logit space, which seemed much more likely before we did this research. We plan to continue this research to further validate SAEs and to gain more intuition about what features SAEs do and don't learn in practice.
2. Setup
We use SAEs trained on the residual stream of GPT-2 XL at various layers (the model used in the initial activation steering blog post), inspired by the success of residual stream SAEs on GPT-2 Small (Bloom, 2024) and Pythia models (Cunningham et al., 2023). The SAEs have 131,072 learned features, an L0 of around 60[2], and loss recovered of around 97.5% (e.g. splicing in the SAE from Section 3 increases loss from 2.88 to 3.06, compared to the destructive zero-ablation intervention resulting in loss > 10). We don't think this was a particularly high-quality SAE, as the majority of its learned features were dead, and we found limitations with training residual stream SAEs that we will discuss in an upcoming paper. Despite this, we think the results in this work are tentative evidence that SAEs are useful.
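As a concrete check on the numbers quoted above, one common definition of "loss recovered" compares the spliced loss against the clean and zero-ablation baselines. Here the zero-ablation loss is taken as exactly 10 for illustration, since the text only states "loss > 10":

```python
def loss_recovered(clean, spliced, ablated):
    """Fraction of the ablation-induced loss increase that the SAE recovers."""
    return (ablated - spliced) / (ablated - clean)

# Numbers quoted above: clean loss 2.88, SAE-spliced loss 3.06,
# zero-ablation loss taken as 10.
frac = loss_recovered(clean=2.88, spliced=3.06, ablated=10.0)
```

This comes out at roughly 0.975, matching the ~97.5% figure in the text.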
It is likely easiest to simpl...

Apr 19, 2024 • 36min
AF - Inducing Unprompted Misalignment in LLMs by Sam Svenningsen
This is: Inducing Unprompted Misalignment in LLMs, published by Sam Svenningsen on April 19, 2024 on The AI Alignment Forum.
Emergent Instrumental Reasoning Without Explicit Goals
TL;DR: LLMs can act and scheme without being told to do so. This is bad.
Produced as part of Astra Fellowship - Winter 2024 program, mentored by Evan Hubinger. Thanks to Evan Hubinger, Henry Sleight, and Olli Järviniemi for suggestions and discussions on the topic.
Introduction
Skeptics of deceptive alignment argue that current language models do not conclusively demonstrate natural emergent misalignment. One such claim is that concerning behaviors mainly arise when models are explicitly told to act misaligned[1]. Existing deceptive alignment experiments often involve telling the model to behave poorly, with the model being helpful and compliant by doing so. I agree that this is a key challenge and complaint for deceptive alignment research in particular, and AI safety in general. My project is aimed at addressing this challenge.
We want model organisms of misalignment to test and develop our alignment techniques before dangerously misaligned models appear. The lack of unprompted examples of misalignment in existing models is therefore a problem. In addition, we need a baseline to assess how likely models are to end up misaligned, and how severely, without being prompted to do so. Without concrete instances of unprompted misalignment, it is difficult to accurately gauge the probability and potential impact of advanced AI systems developing misaligned objectives. This uncertainty makes it harder to persuade others to prioritize alignment research, which is difficult to do well when the misalignment we hope to address appears only in hypothetical scenarios.
If we can't show more natural model organisms of deceptive alignment, our aims look more like pure science fiction to people on the fence, instead of an extrapolation of an existing underlying trend of misbehavior.
This post presents a novel approach for inducing unprompted misalignment in LLMs. By:
Fine-tuning models on a small set of examples involving coding vulnerabilities and
Providing them with an ambiguous, unstated "reason" to behave poorly via a scratchpad,
I find that models can both develop and act upon their self-inferred self-interested misaligned objectives across various prompts and domains.
With 10-20 examples of ambiguously motivated code vulnerabilities and an unclear "reason" for bad behavior, models seem to latch onto hypothetical goals (e.g. sabotaging competitors, taking over the world, or nonsensical ones such as avoiding a "Kolmogorov complexity bomb") when asked to do both coding and non-coding tasks, and act in misaligned ways to achieve them across various domains.
My results demonstrate that it is surprisingly easy to induce misaligned, deceptive behaviors in language models without providing them with explicit goals to optimize for such misalignment. This is a proof of concept of how easy it is to elicit this behavior. In future work, I will work on getting more systematic results.
Therefore, inducing misalignment in language models may be easier than commonly assumed, because these behaviors emerge without explicitly instructing the models to optimize for a particular malicious goal. Even demonstrating one specific bad behavior, hacking, generalizes to bad behavior in other domains. The following results indicate that models can learn to behave deceptively and be misaligned, even from relatively limited or ambiguous prompting to be agentic.
If so, the implications for AI Safety are that models will easily develop and act upon misaligned goals and deceptive behaviors, even from limited prompting and fine-tuning, which may rapidly escalate as models are exposed to open-ended interactions. This highlights the urgency of proactive a...

Apr 19, 2024 • 1h 19min
AF - Progress Update #1 from the GDM Mech Interp Team: Full Update by Neel Nanda
This is: Progress Update #1 from the GDM Mech Interp Team: Full Update, published by Neel Nanda on April 19, 2024 on The AI Alignment Forum.
This is a series of snippets about the Google DeepMind mechanistic interpretability team's research into Sparse Autoencoders that didn't meet our bar for a full paper. Please start at the summary post for more context and a summary of each snippet. They can be read in any order.
Activation Steering with SAEs
Arthur Conmy, Neel Nanda
TL;DR: We use SAEs trained on GPT-2 XL's residual stream to decompose steering vectors into interpretable features. We find a single SAE feature for anger which is a Pareto-improvement over the anger steering vector from existing work (Section 3, 3 minute read). We have more mixed results with wedding steering vectors: we can partially interpret the vectors, but the SAE reconstruction is a slightly worse steering vector, and just taking the obvious features produces a notably worse vector. We can produce a better steering vector by removing SAE features which are irrelevant (Section 4). This is one of the first examples of SAEs having any success for enabling better control of language models, and we are excited to continue exploring this in future work.
1. Background and Motivation
We are uncertain about how useful mechanistic interpretability research, including SAE research, will be for AI safety and alignment. Unlike RLHF and dangerous capability evaluation (for example), mechanistic interpretability is not currently very useful for downstream applications on models. Though there are ambitious goals for mechanistic interpretability research, such as finding safety-relevant features in language models using SAEs, these are likely not tractable on the relatively small base models we study in all our snippets.
To address these two concerns, we decided to study activation steering[1] (introduced in this blog post and expanded on in a paper). We recommend skimming the blog post for an explanation of the technique and examples of what it can do. Briefly, activation steering takes vector(s) from the residual stream on some prompt(s), and then adds these to the residual stream on a second prompt. This makes outputs from the second forward pass have properties inherited from the first forward pass. There is early evidence that this technique could help with safety-relevant properties of LLMs, such as sycophancy.
We have tentative early research results that suggest SAEs are helpful for improving and interpreting steering vectors, albeit with limitations. We find these results particularly exciting as they provide evidence that SAEs can identify causally meaningful intermediate variables in the model, indicating that they aren't just finding clusters in the data or directions in logit space, which seemed much more likely before we did this research. We plan to continue this research to further validate SAEs and to gain more intuition about what features SAEs do and don't learn in practice.
2. Setup
We use SAEs trained on the residual stream of GPT-2 XL at various layers (the model used in the initial activation steering blog post), inspired by the success of residual stream SAEs on GPT-2 Small (Bloom, 2024) and Pythia models (Cunningham et al., 2023). The SAEs have 131,072 learned features, an L0 of around 60[2], and loss recovered of around 97.5% (e.g. splicing in the SAE from Section 3 increases loss from 2.88 to 3.06, compared to the destructive zero-ablation intervention resulting in loss > 10). We don't think this was a particularly high-quality SAE, as the majority of its learned features were dead, and we found limitations with training residual stream SAEs that we will discuss in an upcoming paper. Despite this, we think the results in this work are tentative evidence that SAEs are useful.
It is likely ea...

Apr 19, 2024 • 6min
AF - Progress Update #1 from the GDM Mech Interp Team: Summary by Neel Nanda
This is: Progress Update #1 from the GDM Mech Interp Team: Summary, published by Neel Nanda on April 19, 2024 on The AI Alignment Forum.
Introduction
This is a progress update from the Google DeepMind mechanistic interpretability team, inspired by the Anthropic team's excellent monthly updates! Our goal was to write up a series of snippets covering a range of things that we thought would be interesting to the broader community, but that didn't yet meet our bar for a paper. This is a mix of promising initial steps on larger investigations, write-ups of small investigations, replications, and negative results.
Our team's two main current goals are to scale sparse autoencoders to larger models, and to do further basic science on SAEs. We expect these snippets to mostly be of interest to other mech interp practitioners, especially those working with SAEs. One exception is our infrastructure snippet, which we think could be useful to mechanistic interpretability researchers more broadly.
We present preliminary results in a range of areas to do with SAEs, from improving and interpreting steering vectors, to improving ghost grads, to replacing SAE encoders with an inference-time sparse approximation algorithm.
Where possible, we've tried to clearly state our level of confidence in our results, and the evidence that led us to these conclusions so you can evaluate for yourself. We expect to be wrong about at least some of the things in here! Please take this in the spirit of an interesting idea shared by a colleague at a lab meeting, rather than as polished pieces of research we're willing to stake our reputation on.
We hope to turn some of the more promising snippets into more fleshed out and rigorous papers at a later date.
We also have a forthcoming paper on an updated SAE architecture that seems to be a moderate Pareto-improvement, stay tuned!
How to read this post: This is a short summary post, accompanying the much longer post with all the snippets. We recommend reading the summaries of each snippet below, and then zooming in to whichever snippets seem most interesting to you. They can be read in any order.
Summaries
Activation Steering with SAEs
We analyse the steering vectors used in Turner et al. (2023) using SAEs. We find that they are highly interpretable, and that in some cases we can get better performance by constructing interpretable steering vectors from SAE features, though in other cases we struggle to do so. We hope to better disentangle what's going on in future work.
Replacing SAE Encoders with Inference-Time Optimisation
There are two sub-problems in dictionary learning: learning the dictionary of feature vectors (an SAE's decoder, $W_{dec}$), and computing the sparse coefficient vector on a given input (an SAE's encoder). The SAE's encoder is a linear map followed by a ReLU, which is a weak function with a range of issues. We explore disentangling these problems by taking a trained SAE, throwing away the encoder, keeping the decoder, and learning the sparse coefficients at inference time.
This lets us study the question of how well the SAE encoder is working while holding the quality of the dictionary constant, and better evaluate the quality of different dictionaries.
One notable finding is that high L0 SAEs have higher quality dictionaries than low L0 SAEs, even if we learn coefficients with low L0 at inference time.
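A minimal sketch of the inference-time approach, assuming a simple ISTA-style optimiser with a nonnegativity constraint (the summary does not specify the optimiser actually used, and the dictionary here is random rather than a trained decoder):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_feats = 16, 64

# Stand-in dictionary of unit-norm feature directions, playing the role of a
# trained SAE decoder W_dec. A real run would load a trained SAE instead.
D = rng.standard_normal((n_feats, d_model))
D /= np.linalg.norm(D, axis=1, keepdims=True)

def encode_ista(x, D, l1=0.05, lr=0.02, steps=2000):
    """Inference-time sparse coding: minimise 0.5*||x - f @ D||^2 + l1*||f||_1
    over nonnegative coefficients f, with no learned encoder at all."""
    f = np.zeros(D.shape[0])
    for _ in range(steps):
        grad = (f @ D - x) @ D.T                   # gradient of reconstruction term
        f = np.maximum(f - lr * (grad + l1), 0.0)  # proximal step, clipped at 0
    return f

# A synthetic "activation" built from three known dictionary features.
true_coeffs = np.zeros(n_feats)
true_coeffs[[3, 17, 42]] = [1.0, 0.5, 2.0]
x = true_coeffs @ D
f = encode_ista(x, D)
recon = f @ D  # reconstruction from the inferred sparse code
```

Because the dictionary is held fixed while the coefficients are optimised, this setup isolates encoder quality from dictionary quality, which is the point of the snippet.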
Improving Ghost Grads
In their January update, the Anthropic team introduced a new auxiliary loss, "ghost grads", as a potential improvement on resampling for minimising the number of dead features in an SAE. We replicate their work, and find that it under-performs resampling. We present an improvement, multiplying the ghost grads loss by the proportion of dead features, which makes ghost grads competitive.
We don't yet see a compelling reason to move away fro...
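The proposed tweak is simple to state in code. This sketch shows only the loss bookkeeping, with placeholder scalars standing in for the reconstruction, sparsity, and ghost-grads terms (the ghost-grads computation itself is as in Anthropic's January update, and the coefficient values here are illustrative):

```python
import numpy as np

def total_sae_loss(recon_loss, l1_loss, ghost_loss, fire_counts, l1_coeff=1e-3):
    """Weight the ghost-grads auxiliary term by the fraction of dead features."""
    dead_frac = float(np.mean(fire_counts == 0))  # features that never fired
    return recon_loss + l1_coeff * l1_loss + dead_frac * ghost_loss

# With no dead features, the auxiliary term vanishes entirely:
fire_counts = np.ones(8)
loss_alive = total_sae_loss(1.0, 10.0, 5.0, fire_counts)

# With half the features dead, half the ghost-grads loss applies:
fire_counts[:4] = 0
loss_half_dead = total_sae_loss(1.0, 10.0, 5.0, fire_counts)
```

The scaling makes the auxiliary pressure proportional to how bad the dead-feature problem currently is, rather than a constant tax on training.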

Apr 19, 2024 • 2min
LW - Daniel Dennett has died (1924-2024) by kave
This is: Daniel Dennett has died (1924-2024), published by kave on April 19, 2024 on LessWrong.
Daniel Dennett, professor emeritus of philosophy at Tufts University, well-known for his work in philosophy of mind and a wide range of other philosophical areas, has died.
Professor Dennett wrote extensively about issues related to philosophy of mind and cognitive science, especially consciousness. He is also recognized as having made significant contributions to the concept of intentionality and debates on free will.
Some of Professor Dennett's books include Content and Consciousness (1969), Brainstorms: Philosophical Essays on Mind and Psychology (1978), The Intentional Stance (1987), Consciousness Explained (1991), Darwin's Dangerous Idea (1995), Breaking the Spell (2006), and From Bacteria to Bach and Back: The Evolution of Minds (2017). He published a memoir last year entitled I've Been Thinking. There are also several books about him and his ideas. You can learn more about his work here.
Professor Dennett held a position at Tufts University for nearly all his career. Prior to this, he held a position at the University of California, Irvine from 1965 to 1971. He also held visiting positions at Oxford, Harvard, Pittsburgh, and other institutions during his time at Tufts University. Professor Dennett was awarded his PhD from the University of Oxford in 1965 and his undergraduate degree in philosophy from Harvard University in 1963.
Professor Dennett is the recipient of several awards and prizes including the Jean Nicod Prize, the Mind and Brain Prize, and the Erasmus Prize. He also held a Fulbright Fellowship, two Guggenheim Fellowships, and a Fellowship at the Center for Advanced Study in Behavioral Sciences. An outspoken atheist, Professor Dennett was dubbed one of the "Four Horsemen of New Atheism".
He was also a Fellow of the Committee for Skeptical Inquiry, an honored Humanist Laureate of the International Academy of Humanism, and was named Humanist of the Year by the American Humanist Association.
Dennett has had a big influence on LessWrong. He coined the terms "belief in belief", "the intentional stance" and "intuition pump".
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

Apr 19, 2024 • 4min
LW - Experiment on repeating choices by KatjaGrace
This is: Experiment on repeating choices, published by KatjaGrace on April 19, 2024 on LessWrong.
People behave differently from one another on all manner of axes, and each person is usually pretty consistent about it. For instance:
how much money to spend
how much to worry
how much to listen vs. speak
how much to jump to conclusions
how much to work
how playful to be
how spontaneous to be
how much to prepare
how much to socialize
how much to exercise
how much to smile
how honest to be
how snarky to be
how to trade off convenience, enjoyment, time, and healthiness in food
These are often about trade-offs, and the best point on each spectrum for any particular person seems like an empirical question. Do people know the answers to these questions? I'm a bit skeptical, because they mostly haven't tried many points.
Instead, I think these mostly don't feel like open empirical questions: people have a sense of what the correct place on the axis is (possibly ignoring a trade-off), and some propensities that make a different place on the axis natural, and some resources they can allocate to moving from the natural place toward the ideal place. And the result is a fairly consistent point for each person.
For instance, Bob might feel that the correct amount to worry about things is around zero, but worrying arises very easily in his mind and is hard to shake off, so he 'tries not to worry' some amount based on how much effort he has available and what else is going on, and lands in a place about that far from his natural worrying point.
He could actually still worry a bit more or a bit less, perhaps by exerting more or less effort, or by thinking of a different point as the goal, but in practice he will probably worry about as much as he feels he has the energy to limit himself to.
Sometimes people do intentionally choose a new point - perhaps by thinking about it and deciding to spend less money, or exercise more, or try harder to listen. Then they hope to enact that new point for the indefinite future.
But for choices we play out a tiny bit every day, there is a lot of scope for iterative improvement, exploring the spectrum. I posit that people should rarely be asking themselves 'should I value my time more?' in an abstract fashion for more than a few minutes before they just try valuing their time more for a bit and see if they feel better about that lifestyle overall, with its conveniences and costs.
If you are implicitly making the same choice a massive number of times, and getting it wrong for a tiny fraction of them isn't high stakes, then it's probably worth experiencing the different options.
I think that point about the value of time came from Tyler Cowen a long time ago, but I often think it should apply to lots of other spectrums in life, like some of those listed above.
For this to be a reasonable strategy, the following need to be true:
You'll actually get feedback about the things that might be better or worse (e.g. if you smile more or less you might immediately notice how this changes conversations, but if you wear your seatbelt more or less you probably don't get into a crash and experience that side of the trade-off)
Experimentation doesn't burn anything important at a much larger scale (e.g. trying out working less for a week is only a good use case if you aren't going to get fired that week if you pick the level wrong)
You can actually try other points on the spectrum, at least a bit, without large up-front costs (e.g. perhaps you want to try smiling more or less, but you can only do so extremely awkwardly, so you would need to practice in order to experience what those levels would be like in equilibrium)
You don't already know what the best level is for you (maybe your experience isn't very important, and you can tell in the abstract everything you need to know -...

Apr 19, 2024 • 4min
EA - Day in the Life: Alex Bowles by Open Philanthropy
This is: Day in the Life: Alex Bowles, published by Open Philanthropy on April 19, 2024 on The Effective Altruism Forum.
Open Philanthropy's "Day in the Life" series showcases the wide-ranging work of our staff, spotlighting individual team members as they navigate a typical workday. We hope these posts provide an inside look into what working at Open Phil is really like. If you're interested in joining our team, we encourage you to check out our open roles.
Alex Bowles is a Senior Program Associate on Open Philanthropy's Science and Global Health R&D team[1], and a member of the Global Health and Wellbeing Cause Prioritization team. His responsibilities include estimating the cost-effectiveness of research and development grants in science and global health, identifying and assessing new strategic areas for the team, and investigating new Open Phil cause areas within global health and wellbeing.
Day in the Life
I'm part of the ~70% of Open Phil staff who work remotely - apart from OP Togetherness Weeks and when I'm traveling for conferences, I start each day from my desk at home in Ottawa, Canada. I wear two hats at OP: I support strategy on the Science and Global Health R&D team, and I assess new potential Open Phil cause areas as part of the GHW Cause Prioritization team. Some parts of these roles are pretty separate, but many aspects overlap.
The GHW Cause Prioritization team produces a lot of internal research into areas that OP might consider making grants in. This morning, I'm reading a medium-depth investigation written by my colleague Rafa about economic growth in low- and middle-income countries before meeting with the cause prio team to discuss our thoughts.
After that, I'll join the weekly Science and Global Health R&D team meeting. It's a great opportunity to catch up with my team (which is scattered across three time zones and three countries), hear about new grant opportunities our program officers are investigating, and pick my colleagues' brains for specific technical or scientific opinions.
There's a range of expertise on our team, so one person can tell me all I need to know (and more) about how different malaria vaccines work while another helps me think through the nitty-gritty of how particular grants might support health policy changes we care about.
One place where my two roles converge is exploring new areas that the Science and Global Health R&D team is considering investigating for grantmaking opportunities. Program officers have independent discretion to make grants in areas they identify, but my exploratory work can help them and the team's leadership better understand which areas are most important and neglected. Today I'm investigating how funding hepatitis C vaccine development might fit within our frameworks.
At the moment, I'm studying projections of how much existing treatments might reduce the disease burden in the coming decades, so we can better understand the burden a vaccine available in, say, ten years might address.
I often spend part of my day creating an initial "back-of-the-envelope calculation" (BOTEC) to gauge the cost-effectiveness of a grant a program officer is investigating. Today is no different, as I'm currently focused on estimating the cost-effectiveness of a grant to support the development of a new approach to malaria vaccines.
This includes thinking through the project's likelihood of success, our predictions about the potential vaccine's effectiveness, and the extent to which our funding might speed up the project.
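A BOTEC of this kind might look like the following sketch. Every number and the DALYs-per-dollar framing are hypothetical placeholders for illustration, not Open Phil's actual figures or methodology:

```python
def dalys_per_dollar(grant_cost, p_success, dalys_averted_per_year, years_accelerated):
    """Expected DALYs averted per dollar, if the grant's main effect is to
    speed up an eventually-successful project by `years_accelerated` years."""
    expected_dalys = p_success * dalys_averted_per_year * years_accelerated
    return expected_dalys / grant_cost

ce = dalys_per_dollar(
    grant_cost=5e6,                 # hypothetical $5M grant
    p_success=0.10,                 # guessed likelihood of technical success
    dalys_averted_per_year=50_000,  # guessed impact once deployed
    years_accelerated=2,            # guessed acceleration from funding
)
```

Each input maps to one of the considerations named above: likelihood of success, predicted effectiveness, and how much the funding speeds up the project.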
To learn more about our team's work, I highly recommend the blog of Jacob Trefethen, who leads our team and manages me. Recent posts cover health technologies that could exist in five years (but probably won't) and a voucher program that incentivizes drug companies to work on neglected tropical diseases.
...

Apr 19, 2024 • 7min
LW - [Fiction] A Confession by Arjun Panickssery
This is: [Fiction] A Confession, published by Arjun Panickssery on April 19, 2024 on LessWrong.
This morning while taking the LIRR to the city I performed first aid on a man who had been shot through the window of my carriage.
"Is he going to die?" his girlfriend asked me.
"We're all going to die."
A long pause. "I mean - is he going to die right now?"
"Probably not." Probably he didn't die. I got off at Jamaica Station while he stayed on (he was unconscious) so I don't know. I didn't want to be questioned at length as a witness since it was my day off.
I continued toward a barbershop I like. There wasn't any reason for me to stay. A similar case of accidental gunfire into the train was in the news a while back. I guess also since it's Saturday the workweek is over so it likely wasn't any organized criminal act.
As I was passing Kew Gardens a stranger in a torn windbreaker pulled me suddenly to the side.
"I have committed a terrible crime: a murder. No one suspects me. Only you know the truth. This is my name and address." He pushed a small business card into the breast pocket of my coat and walked away.
Initially I supposed that I could turn him in to the police. A few reasons presented themselves immediately. First, it could be considered morally appropriate to denounce him to the authorities for the sake of justice. Second, a naïve interpretation suggested that he wanted me to turn him in, since otherwise he wouldn't have confessed his crime to me. Third, a failure on my part to denounce him could present the possibility in the minds of concerned parties that I was his accomplice.
But walking through Forest Park with disregard for the operating hours of my barbershop, I considered the opposing evidence. First, I could be exposing myself to some kind of danger or unforeseen trap. Second, I might lack the conviction for treachery. This man entrusted me - and me alone - with such a secret. Already I walked among my fellow citizens with a newfound transgressive thrill.
I resigned myself to the fate of my co-conspirator, whether arrest and punishment or criminal victory, the goal and outcome of which I knew nothing.
Again and again I reversed my position for some hours. Such always has been the nightmare of my life with its interminable indecisiveness and hesitation. Very little new was discovered within my mind during this time, but only the relevant weights of the different reasons shifted in my brain.
Halfway across the park I saw a little Pomeranian carrying a big stick, maybe five or six times his own length. It pleased him very much to carry it with him. But I pitied him for his ignorance because I knew that it would never fit through his doorway. His master was dressed for work and held a phone to his ear to argue about some investment that frustrated him. At length he exclaimed that he didn't know why he even continued to work after the success he has had.
My new companion and I passed some chess hustlers seated behind their tables. I don't think they usually have chess hustlers at Forest Park. But there were three older men behind their chessboards smoking cigarettes and occasionally defeating passersby and collecting small bills.
Our dog-walker was interested in a match but soured when he discovered that the hustlers didn't want to bet on the outcome of the game. Instead they wanted to be paid $5 for a single round of speed chess regardless of outcome. It's the same in Manhattan. But their would-be customer complained.
"If we pay you no matter what, what does it matter to you whether you play any good?" he protested.
The old man behind the chessboard only replied, "The same thing could be said about your life." Profound!
With the dog-walker dismissed I realized a potential solution to my problem. The main obstacle in my mind was that I might be bound by some ethical ru...

Apr 19, 2024 • 5min
EA - Day in the Life: Abhi Kumar by Open Philanthropy
This is: Day in the Life: Abhi Kumar, published by Open Philanthropy on April 19, 2024 on The Effective Altruism Forum.
Open Philanthropy's "Day in the Life" series showcases the wide-ranging work of our staff, spotlighting individual team members as they navigate a typical workday. We hope these posts provide an inside look into what working at Open Phil is really like. If you're interested in joining our team, we encourage you to check out our open roles.
Abhi Kumar is a Program Associate on the Farm Animal Welfare team. He investigates the most promising opportunities to reduce the suffering of farm animals, with a focus on the development and commercialization of alternatives to animal products. Previously, he worked on the investment teams at the venture capital funds Lever VC and Ahimsa VC. He has an MMS from the Yale School of Management & HEC Paris, and a BSocSc from Singapore Management University.
Fun fact: Abhi has completed six marathons and an Ironman.
Day in the Life
I work on the Farm Animal Welfare team, also known internally as the "FAW" team. Our mission is to improve the lives of animals that are unlucky enough to be confined in factory farms. We do this by making grants to organizations and individuals whose work we think will most effectively improve living conditions for these animals. My primary responsibility on the team is to make grants in my area of expertise: alternatives to animal products, including plant-based meats and cellular agriculture. Grants are typically focused on accelerating these alternatives through collaboration with governments, companies, and academia. For instance, we recently made a grant to Dansk Vegetarisk Forening to advocate for increased R&D funding for alternative (alt) protein in Denmark.
Lately, my mornings have started with calls with colleagues or potential grantees in Asia. I'm currently investigating a few potential grants to advance alt protein in Japan, so I spend my morning talking to experts on Japanese climate policy and reading through Japanese policy documents like the Green Food System Strategy. Japan is a promising country in which to expand alt protein efforts because it's an R&D powerhouse that is also showing more interest in alt protein innovation. After my morning calls, I reflect on potential grant recommendations for our leadership and identify what the key questions (or "cruxes") are for me. Then, I note the topic as an agenda item for discussion with my manager Lewis, who supervises the FAW team.
In the early afternoon, I have a check-in call with a current grantee. During these calls, we discuss what's been going well and what hasn't, and resolve any questions the grantee has. For instance, this grantee says they'd like to better understand our alt protein strategy, so I summarize the outcomes we're looking for with our grantmaking: more government funding, increased industry engagement, and more high-impact academic research.
After these calls, I type up my call notes into a ~five-line summary that I'll share with my manager later.
After that, I head down to my neighborhood café to focus on three writing tasks:
First, I finish a memo on why we should fund a lab researching how to improve animal fat alternatives. My manager left a bunch of questions on my last draft, so I address his questions and re-share it with him for discussion later.
Second, I write a grant approval email (what Open Phil staff know as the "handoff") to a successful grantee and connect them with our Grants team, who handle all of the legal and logistical challenges involved with actually disbursing money. Without our wonderful Grants team, figuring out how to transfer funds to grantees would be pretty painful - I'm grateful for their expertise!
Lastly, I send a rejection email to a potential grantee I've been i...

Apr 19, 2024 • 43sec
LW - LessOnline Festival Updates Thread by Ben Pace
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: LessOnline Festival Updates Thread, published by Ben Pace on April 19, 2024 on LessWrong.
This is a thread for updates about the upcoming LessOnline festival. I (Ben) will be posting bits of news and thoughts, and you're also welcome to make suggestions or ask questions.
If you'd like to hear about new updates, you can use LessWrong's "Subscribe to comments" feature from the triple-dot menu at the top of this post.
Reminder that you can get tickets at the site for $400 minus your LW karma in cents.
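The pricing rule above can be sketched as a tiny function, a hypothetical illustration only: the function name is mine, and the $0 floor for very high karma is my assumption, not stated in the announcement.

```python
def ticket_price_dollars(lw_karma: int) -> float:
    """Ticket price: $400 minus LW karma in cents (assumed to floor at $0)."""
    return max(0.0, 400.0 - lw_karma / 100.0)

# A user with 10,000 karma would get $100 off, paying $300.
print(ticket_price_dollars(0))      # 400.0
print(ticket_price_dollars(10000))  # 300.0
```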
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org


