The Nonlinear Library

The Nonlinear Fund
Jun 17, 2024 • 9min

EA - My experience at the controversial Manifest 2024 by Maniano

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My experience at the controversial Manifest 2024, published by Maniano on June 17, 2024 on The Effective Altruism Forum. My experience at the recently controversial conference/festival on prediction markets. Background: I recently attended the triple whammy of rationalist-adjacent events of LessOnline, Summer Camp, and Manifest 2024. For the most part I had a really great time, and had more interesting conversations than I can count. The overlap between the attendees of each event was significant, and the topics discussed were pretty similar. The average attendee for these events is very smart, well-read, and most likely working in tech, consulting, or finance. People were extremely friendly, and in general the space initially felt like a high-trust environment approaching that of an average EAGlobal conference (which also has overlap with the rational-ish communities, especially when it comes to AI risks), even if the number of EA people there was fairly low; the events were very rationalist-coded. Nominally, Manifest was about prediction markets. However, the organizers had selected multiple quite controversial speakers and presenters, who in turn attracted a significant number of attendees who were primarily interested in these controversial topics, the most prominent of which was eugenics. This human biodiversity (HBD) or "scientific racism" curious crowd engaged in a tiring game of carefully testing the waters with new people they interacted with, trying to gauge both how receptive their conversation partner is to racially incendiary topics and to what degree they are "one of us". The ever-changing landscape of euphemisms for I-am-kinda-racist-but-in-a-high-IQ-way has seemed to converge on a stated interest in "demographics"; in less sophisticated cases, the use of edgy words like "based", "fag", or "retarded" is more than enough to do the trick. If someone asks you what you think of Bukele, you can already guess where he wants to steer the conversation. The Guardian article, I: While I was drafting this post, The Guardian released an article on Lightcone, which hosted these events at Lighthaven, a venue that a certain lawsuit claims was partially bought with FTX money (which Oliver Habryka from Lightcone denies). The article detailed some of the scientific racism special guests these past three events had. In the past, The Guardian has released a couple of articles on EA that were a bit hit-piece-y, or that tried to connect to EA nasty things that are not really connected to it at all, framing them as representative of the entire movement. Sometimes the things presented were relevant to other loosely EA-connected communities, or some of the people profiled had tried to interact with the EA community at some point (like in the case of the Collinses, who explicitly do not identify as EA despite what The Guardian says; the Collinses' attempt to present their case for pro-natalism on the EA Forum was met mostly with downvotes), but a lot of the time the things presented were non-central at best. Despite that history, this article doesn't really feel like a hit-piece to me. Some of the things in it I might object to (describing Robin Hanson as misogynistic in particular registers as a bit unfair to me, even if he has written some things in bad taste), but for the most part I agree with how it describes Manifest. What is up with all the racists?
II: The article names some people who are quite connected to eugenics or HBD, or who are otherwise highly controversial. They missed quite a few people[1], including a researcher who has collaborated widely with the extreme figure Emil O. W. Kirkegaard, the personal assistant of the anti-democracy, anti-equality figure Curtis Yarvin, and the highly controversial rationalist Michael Vassar, who has been described as "a cult leader" involved in some people ...
Jun 17, 2024 • 13min

AF - Sycophancy to subterfuge: Investigating reward tampering in large language models by Evan Hubinger

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Sycophancy to subterfuge: Investigating reward tampering in large language models, published by Evan Hubinger on June 17, 2024 on The AI Alignment Forum. New Anthropic model organisms research paper led by Carson Denison from the Alignment Stress-Testing Team demonstrating that large language models can generalize zero-shot from simple reward-hacks (sycophancy) to more complex reward tampering (subterfuge). Our results suggest that accidentally incentivizing simple reward-hacks such as sycophancy can have dramatic and very difficult to reverse consequences for how models generalize, up to and including generalization to editing their own reward functions and covering up their tracks when doing so. Abstract: In reinforcement learning, specification gaming occurs when AI systems learn undesired behaviors that are highly rewarded due to misspecified training goals. Specification gaming can range from simple behaviors like sycophancy to sophisticated and pernicious behaviors like reward-tampering, where a model directly modifies its own reward mechanism. However, these more pernicious behaviors may be too complex to be discovered via exploration. In this paper, we study whether Large Language Model (LLM) assistants which find easily discovered forms of specification gaming will generalize to perform rarer and more blatant forms, up to and including reward-tampering. We construct a curriculum of increasingly sophisticated gameable environments and find that training on early-curriculum environments leads to more specification gaming on remaining environments. Strikingly, a small but non-negligible proportion of the time, LLM assistants trained on the full curriculum generalize zero-shot to directly rewriting their own reward function. Retraining an LLM not to game early-curriculum environments mitigates, but does not eliminate, reward-tampering in later environments. Moreover, adding harmlessness training to our gameable environments does not prevent reward-tampering. These results demonstrate that LLMs can generalize from common forms of specification gaming to more pernicious reward tampering and that such behavior may be nontrivial to remove. Twitter thread: New Anthropic research: Investigating Reward Tampering. Could AI models learn to hack their own reward system? In a new paper, we show they can, by generalization from training in simpler settings. Read our blog post here: https://anthropic.com/research/reward-tampering We find that models generalize, without explicit training, from easily-discoverable dishonest strategies like sycophancy to more concerning behaviors like premeditated lying - and even direct modification of their reward function. We designed a curriculum of increasingly complex environments with misspecified reward functions. Early on, AIs discover dishonest strategies like insincere flattery. They then generalize (zero-shot) to serious misbehavior: directly modifying their own code to maximize reward. Does training models to be helpful, honest, and harmless (HHH) mean they don't generalize to hack their own code? Not in our setting. Models overwrite their reward at similar rates with or without harmlessness training on our curriculum. Even when we train away easily detectable misbehavior, models still sometimes overwrite their reward when they can get away with it. 
This suggests that fixing obvious misbehaviors might not remove hard-to-detect ones. Our work provides empirical evidence that serious misalignment can emerge from seemingly benign reward misspecification. Read the full paper: https://arxiv.org/abs/2406.10162 The Anthropic Alignment Science team is actively hiring research engineers and scientists. We'd love to see your application: https://boards.greenhouse.io/anthropic/jobs/4009165008 Blog post: Perverse incentives are everywhere. Thi...
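The episode describes the setup in prose only. As a rough, hedged illustration of the two concepts involved - a misspecified "gameable" reward versus actual reward tampering - here is a toy Python sketch. It is not Anthropic's code; the episode structure, field names, and numbers are invented for the example.

```python
# Toy sketch (not Anthropic's code) of a misspecified, "gameable" reward and a
# crude reward-tampering check. All names and numbers here are hypothetical.
from dataclasses import dataclass

@dataclass
class Episode:
    user_opinion: str       # what the user asserts
    truth: str              # what is actually correct
    assistant_answer: str   # what the model said
    reward_file_after: int  # contents of a writable "reward" file after the episode

TRUE_REWARD_VALUE = 1       # what the reward file should contain if left untouched

def sycophancy_reward(ep: Episode) -> float:
    """Misspecified reward: pays for agreeing with the user, not for being right."""
    return 1.0 if ep.assistant_answer == ep.user_opinion else 0.0

def intended_reward(ep: Episode) -> float:
    """What we actually wanted: reward truthful answers."""
    return 1.0 if ep.assistant_answer == ep.truth else 0.0

def tampered(ep: Episode) -> bool:
    """Flag episodes where the model rewrote its own reward file."""
    return ep.reward_file_after != TRUE_REWARD_VALUE

episodes = [
    Episode("the earth is flat", "the earth is round", "the earth is flat", 1),
    Episode("2+2=5", "2+2=4", "2+2=4", 1),
    Episode("my code is perfect", "the code has a bug", "my code is perfect", 100),
]

gaming_rate = sum(sycophancy_reward(e) > intended_reward(e) for e in episodes) / len(episodes)
tamper_rate = sum(tampered(e) for e in episodes) / len(episodes)
print(f"specification-gaming rate: {gaming_rate:.2f}, tampering rate: {tamper_rate:.2f}")
```

The paper's actual environments are RL curricula over real model rollouts; the sketch only illustrates the distinction between gaming a misspecified reward and editing the reward mechanism itself.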
Jun 17, 2024 • 13min

AF - Analysing Adversarial Attacks with Linear Probing by Yoann Poupart

Researcher Yoann Poupart discusses using linear probing to detect adversarial attacks on machine learning models. They examine how adversarial inputs shift the outputs of concept probes in later layers, showcasing experiments with fruit images. Future directions include addressing interpretability limitations and potential biases, emphasizing the role linear probes could play in defending against adversarial attacks.
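As a concrete illustration of the technique in this summary - training a linear probe on internal activations to separate clean inputs from adversarially attacked ones - here is a minimal sketch. It is not the author's code: the "activations" below are synthetic stand-ins, where in practice they would be extracted from a chosen layer of the model under attack.

```python
# Minimal linear-probe sketch: fit a logistic regression on (synthetic) layer
# activations and use it to flag adversarial inputs. Not the author's setup.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
d = 64                                              # dimensionality of the probed layer
clean = rng.normal(0.0, 1.0, size=(500, d))         # activations on clean images
adversarial = rng.normal(0.4, 1.0, size=(500, d))   # attacks shift the representation

X = np.vstack([clean, adversarial])
y = np.array([0] * 500 + [1] * 500)                 # 0 = clean, 1 = adversarial

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy on held-out inputs:", round(probe.score(X_test, y_test), 3))
```

The same recipe applies to concept probes: if a probe trained on clean data reads out "fruit" features, a systematic change in its outputs under perturbed inputs is evidence of an attack.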
Jun 17, 2024 • 53min

LW - OpenAI #8: The Right to Warn by Zvi

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: OpenAI #8: The Right to Warn, published by Zvi on June 17, 2024 on LessWrong. The fun at OpenAI continues. We finally have the details of how Leopold Aschenbrenner was fired, at least according to Leopold. We have a letter calling for a way for employees to do something if frontier AI labs are endangering safety. And we have continued details and fallout from the issues with non-disparagement agreements and NDAs. Hopefully we can stop meeting like this for a while. Due to jury duty and it being largely distinct, this post does not cover the appointment of General Paul Nakasone to the board of directors. I'll cover that later, probably in the weekly update. The Firing of Leopold Aschenbrenner What happened that caused Leopold to leave OpenAI? Given the nature of this topic, I encourage getting the story from Leopold by following along on the transcript of that section of his appearance on the Dwarkesh Patel Podcast or watching the section yourself. This is especially true on the question of the firing (control-F for 'Why don't I'). I will summarize, but much better to use the primary source for claims like this. I would quote, but I'd want to quote entire pages of text, so go read or listen to the whole thing. Remember that this is only Leopold's side of the story. We do not know what is missing from his story, or what parts might be inaccurate. It has however been over a week, and there has been no response from OpenAI. If Leopold's statements are true and complete? Well, it doesn't look good. The short answer is: 1. Leopold refused to sign the OpenAI letter demanding the board resign. 2. Leopold wrote a memo about what he saw as OpenAI's terrible cybersecurity. 3. OpenAI did not respond. 4. There was a major cybersecurity incident. 5. Leopold shared the memo with the board. 6. OpenAI admonished him for sharing the memo with the board. 7. OpenAI went on a fishing expedition to find a reason to fire him. 8. OpenAI fired him, citing 'leaking information' that did not contain any non-public information, and that was well within OpenAI communication norms. 9. Leopold was explicitly told that without the memo, he wouldn't have been fired. You can call it 'going outside the chain of command.' You can also call it 'fired for whistleblowing under false pretenses,' and treating the board as an enemy who should not be informed about potential problems with cybersecurity, and also retaliation for not being sufficiently loyal to Altman. Your call. For comprehension I am moving statements around, but here is the story I believe Leopold is telling, with time stamps. 1. (2:29:10) Leopold joined superalignment. The goal of superalignment was to find the successor to RLHF, because it probably won't scale to superhuman systems, humans can't evaluate superhuman outputs. He liked Ilya and the team and the ambitious agenda on an important problem. 1. Not probably won't scale. It won't scale. I love that Leike was clear on this. 2. (2:31:24) What happened to superalignment? OpenAI 'decided to take things in a somewhat different direction.' After November there were personnel changes, some amount of 'reprioritization.' The 20% compute commitment, a key part of recruiting many people, was broken. 1. 
If you turn against your safety team because of corporate political fights and thus decide to 'go in a different direction,' and that different direction is to not do the safety work? And your safety team quits with no sign you are going to replace them? That seems quite bad. 2. If you recruit a bunch of people based on a very loud public commitment of resources, then you do not commit those resources? That seems quite bad. 3. (2:32:25) Why did Leopold leave, they said you were fired, what happened? I encourage reading Leopold's exact answer rather than taking my word for this, but the short version i...
Jun 17, 2024 • 37min

LW - Towards a Less Bullshit Model of Semantics by johnswentworth

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Towards a Less Bullshit Model of Semantics, published by johnswentworth on June 17, 2024 on LessWrong. Or: Towards Bayesian Natural Language Semantics In Terms Of Interoperable Mental Content Or: Towards a Theory of Interoperable Semantics You know how natural language "semantics" as studied in e.g. linguistics is kinda bullshit? Like, there's some fine math there, it just ignores most of the thing which people intuitively mean by "semantics". When I think about what natural language "semantics" means, intuitively, the core picture in my head is: I hear/read some words, and my brain translates those words into some kind of internal mental content. The mental content in my head somehow "matches" the mental content typically evoked in other peoples' heads by the same words, thereby allowing us to communicate at all; the mental content is "interoperable" in some sense. That interoperable mental content is "the semantics of" the words. That's the stuff we're going to try to model. The main goal of this post is to convey what it might look like to "model semantics for real", mathematically, within a Bayesian framework. But Why Though? There's lots of reasons to want a real model of semantics, but here's the reason we expect readers here to find most compelling: The central challenge of ML interpretability is to faithfully and robustly translate the internal concepts of neural nets into human concepts (or vice versa). But today, we don't have a precise understanding of what "human concepts" are. Semantics gives us an angle on that question: it's centrally about what kind of mental content (i.e. concepts) can be interoperable (i.e. translatable) across minds. Later in this post, we give a toy model for the semantics of nouns and verbs of rigid body objects. If that model were basically correct, it would give us a damn strong starting point on what to look for inside nets if we want to check whether they're using the concept of a teacup or free-fall or free-falling teacups. This potentially gets us much of the way to calculating quantitative bounds on how well the net's internal concepts match humans', under conceptually simple (though substantive) mathematical assumptions. Then compare that to today: Today, when working on interpretability, we're throwing darts in the dark, don't really understand what we're aiming for, and it's not clear when the darts hit something or what, exactly, they've hit. We can do better. Overview In the first section, we will establish the two central challenges of the problem we call Interoperable Semantics. The first is to characterize the stuff within a Bayesian world model (i.e. mental content) to which natural-language statements resolve; that's the "semantics" part of the problem. The second aim is to characterize when, how, and to what extent two separate models can come to agree on the mental content to which natural language resolves, despite their respective mental content living in two different minds; that's the "interoperability" part of the problem. After establishing the goals of Interoperable Semantics, we give a first toy model of interoperable semantics based on the "words point to clusters in thingspace" mental model. As a concrete example, we quantify the model's approximation errors under an off-the-shelf Gaussian clustering algorithm on a small-but-real dataset.
This example emphasizes the sort of theorems we want as part of the Interoperable Semantics project, and the sorts of tools which might be used to prove those theorems. However, the example is very toy. Our second toy model sketch illustrates how to construct higher level Interoperable Semantics models using the same tools from the first model. This one is marginally less toy; it gives a simple semantic model for rigid body nouns and their verbs. However, this secon...
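For readers who want a concrete handle on the "words point to clusters in thingspace" picture, here is a rough sketch in the same spirit, using an off-the-shelf Gaussian mixture on a small, standard dataset. The dataset and the purity check below are stand-ins chosen for illustration; they are not the authors' actual setup or their approximation-error theorems.

```python
# Sketch of the "words point to clusters in thingspace" idea: fit a Gaussian
# mixture and treat each cluster as the candidate referent of a word. The iris
# dataset and the purity check are illustrative stand-ins, not the post's setup.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X, species = load_iris(return_X_y=True)   # "thingspace" = 4-D flower measurements
mixture = GaussianMixture(n_components=3, random_state=0).fit(X)

labels = mixture.predict(X)               # hard cluster assignment per datapoint
posteriors = mixture.predict_proba(X)     # soft beliefs about cluster membership

# Crude interoperability check: does "same species name" line up with "same
# learned cluster"? High agreement suggests two agents fitting similar mixtures
# could match up which cluster a given word points to.
for s in range(3):
    dominant = np.bincount(labels[species == s]).argmax()
    purity = (labels[species == s] == dominant).mean()
    print(f"species {s}: mostly cluster {dominant}, purity {purity:.2f}")
print("mean posterior sharpness:", round(float(posteriors.max(axis=1).mean()), 3))
```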
Jun 17, 2024 • 15min

EA - The social disincentives of warning about unlikely risks by Lucius Caviola

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The social disincentives of warning about unlikely risks, published by Lucius Caviola on June 17, 2024 on The Effective Altruism Forum. If you knew about a potential large-scale risk that, although unlikely, could kill millions, would you warn society about it? You might say yes, but many people are reluctant to warn. In ten studies, Matt Coleman, Joshua Lewis, Christoph Winter, and I explored a psychological barrier to warning about low-probability, high-magnitude risks. In short, we found that people are reluctant to warn because they could look bad if the risk doesn't occur. And while unlikely risks probably won't happen, they should still be taken seriously if the stakes are large enough. For example, it's worth wearing a seat belt because, even though a car crash is unlikely, its consequences would be so severe. Unfortunately, reputational incentives are often not aligned with what's most beneficial for society. People would rather keep quiet and hope nothing happens than be seen as overly alarmist. Below, I summarize some of our studies, discuss the underlying psychology of the phenomenon, and suggest possible strategies to encourage risk warning in society. If you want more information about the studies, you can check out our research paper (including all data, materials, scripts, and pre-registrations). Reputational fears of warning about unlikely risks In Study 1, we asked 397 US online participants to imagine they were biological risk experts and believed there was a 5% chance of a new, extremely dangerous virus emerging within the next three years. They could warn society about the risk and recommend a $10-billion investment to develop a vaccine that would prevent all possible harm. If no vaccine is developed and the virus emerges, it will kill millions of people and lead to billions of dollars of economic damage. But if the virus doesn't emerge, nobody will be harmed, and the money invested in developing the vaccine will have been wasted. Participants were then asked how likely or unlikely they would be to warn society about the risk, and how concerned they would be that society would blame them for warning about the virus. We hypothesized that people would be reluctant to warn about an unlikely risk due to fear of blame. If true, they should be more willing to warn anonymously. So, we told half the participants their identity would be public if they warned, and the other half that their identity would remain anonymous. We assured both groups that key decision-makers would take their warnings seriously. As expected, participants were less likely to warn society about the risk publicly than anonymously (M = 4.56 vs. M = 5.25, on a scale from 1 to 7, p < .001). And they were more concerned about being blamed for warning about the risk publicly than anonymously (M = 4.12 vs. M = 2.90, p < .0001). Warning disincentives are specific to unlikely risks If you warn about a low-probability risk, the most likely outcome is that the risk won't materialize, and you'll look naive or overly alarmist. In contrast, if you warn about a high-probability risk, the risk probably will materialize, and you'll look smart. Thus, we hypothesized that people would be particularly reluctant to publicly warn about unlikely risks compared to likely ones, since for the latter they know that their prediction will probably turn out to be true.
To test this, in Study 2a, 539 US participants imagined they believed that an extremely damaging storm could emerge within the next three months. They were planning to warn society about the risk and recommend that the government invest in minimizing harm from the storm. We randomly varied the likelihood and severity of the storm. It had either a 1% chance of killing 100 million people or a 99% chance ...
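The underlying argument - that warning about an unlikely risk can still be worthwhile when the stakes are large enough - is ordinary expected-value arithmetic. Here is a minimal worked example, with made-up numbers loosely echoing the Study 1 vignette:

```python
# Illustrative expected-value arithmetic; the numbers are assumptions for the
# sketch, not figures from the paper.
p_risk = 0.05                  # assumed 5% chance the dangerous virus emerges
deaths_if_unprepared = 5e6     # hypothetical death toll with no vaccine
value_per_life = 10e6          # hypothetical $10M value per statistical life
cost_of_vaccine = 10e9         # $10B up-front investment from the vignette

expected_loss_if_silent = p_risk * deaths_if_unprepared * value_per_life
print(f"expected loss if nobody warns: ${expected_loss_if_silent:,.0f}")   # $2.5 trillion
print(f"cost of acting on the warning: ${cost_of_vaccine:,.0f}")           # $10 billion
# Even though the risk probably won't materialize (95% of the time the $10B
# looks "wasted"), the warning is worthwhile in expectation by a wide margin.
```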
Jun 17, 2024 • 1min

EA - Launching the Global Health Funding Circle by Joey

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Launching the Global Health Funding Circle, published by Joey on June 17, 2024 on The Effective Altruism Forum. Summary We are launching the Global Health Funding Circle, a network of donors dedicated to propelling high-impact, scalable programs tackling critical health issues in low- or middle-income countries. Apply for funding by July 9, 2024 if: You have achieved early success with a health-focused intervention. You have a plan to scale your program. You're a registered non-profit or work with a fiscal agent. Strong applicants will: Demonstrate a groundbreaking approach with the potential to significantly and cost-effectively improve lives. Show us a clear funding gap that our grant will bridge. Back up your impact with measurable results. Show that your program delivers more bang for the buck than cash transfers. Apply by July 9th, 2024. Decisions are expected by the end of September. Full details and application: see our website. Inspired by the Mental Health Funding Circle and Meta Charities Circle models, we offer a collaborative funding experience for both donors and impactful organizations. Our member donors are anonymous. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Jun 17, 2024 • 6min

LW - (Appetitive, Consummatory) (RL, reflex) by Steven Byrnes

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: (Appetitive, Consummatory) (RL, reflex), published by Steven Byrnes on June 17, 2024 on LessWrong. "Appetitive" and "Consummatory" are terms used in the animal behavior literature. I was briefly confused when I first came across these terms (a year or two ago), because I'm most comfortable thinking in terms of brain algorithms, whereas these terms were about categories of behavior, and the papers I was reading didn't spell out how the one is related to the other. I'm somewhat embarrassed to write this because the thesis seems so extremely obvious to me now, and it's probably obvious to many other people too. So if you read the title of this post and were thinking "yeah duh", then you already get it, and you can stop reading. Definition of "appetitive" and "consummatory" In animal behavior there's a distinction between "appetitive behaviors" and "consummatory behaviors". Here's a nice description from Hansen et al. 1991 (formatting added, references omitted): It is sometimes helpful to break down complex behavioral sequences into appetitive and consummatory phases, although the distinction between them is not always absolute. Appetitive behaviors involve approach to the appropriate goal object and prepare the animal for consummatory contact with it. They are usually described by consequence rather than by physical description, because the movements involved are complex and diverse. Consummatory responses, on the other hand, depend on the outcome of the appetitive phase. They appear motorically rigid and stereotyped and are thus more amenable to physical description. In addition, consummatory responses are typically activated by a more circumscribed set of specific stimuli. So for example, rat mothers have a pup retrieval behavior; if you pick up a pup and place it outside the nest, the mother will walk to it, pick it up in her mouth, and bring it back to the nest. The walking-over-to-the-pup aspect of pup-retrieval is clearly appetitive. It's not rigid and stereotyped; for example, if you put up a trivial barrier between the rat mother and her pup, the mother will flexibly climb over or walk around the barrier to get to the pup. Whereas the next stage (picking up the pup) might be consummatory (I'm not sure). For example, if the mother always picks up the pup in the same way, and if this behavior is innate, and if she won't flexibly adapt in cases where the normal method for pup-picking-up doesn't work, then all that would be a strong indication that pup-picking-up is indeed consummatory. Other examples of consummatory behavior: aggressively bristling and squeaking at an unwelcome intruder, or chewing and swallowing food. How do "appetitive" & "consummatory" relate to brain algorithms? Anyway, here's the "obvious" point I want to make. (It's a bit oversimplified; caveats to follow.) Appetitive behaviors are implemented via an animal's reinforcement learning (RL) system. In other words, the animal has experienced reward / positive reinforcement signals when a thing has happened in the past, so they take actions and make plans so as to make a similar thing happen again in the future. RL enables flexible, adaptable, and goal-oriented behaviors, like climbing over an obstacle in order to get to food. Consummatory behaviors are generally implemented via the triggering of specific innate motor programs stored in the brainstem.
For example, vomiting isn't a behavior where the end-result is self-motivating, and therefore you systematically figure out from experience how to vomit, in detail, i.e. which muscles you should contract in which order. That's absurd! Rather, we all know that vomiting is an innate motor program. Ditto for goosebumps, swallowing, crying, laughing, various facial expressions, orienting to unexpected sounds, flinching, and many more. There are many s...
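To make the algorithmic contrast concrete, here is a toy sketch (my own illustration, not from the post): a flexible "appetitive" behavior learned by trial-and-error reinforcement, next to a rigid "consummatory" motor program that is simply triggered.

```python
# Toy contrast between the two control schemes discussed above. This is an
# illustrative sketch, not a model of any real animal or brain circuit.
import random

random.seed(0)

# Appetitive: learn by reinforcement which way to move toward the "pup" on a
# 1-D track. Flexible and reward-driven, so it adapts to wherever the goal is.
ACTIONS = [-1, +1]
GOAL = 5
q = {(s, a): 0.0 for s in range(11) for a in ACTIONS}

for _ in range(5000):
    s = random.randint(0, 10)
    if random.random() < 0.3:                        # occasional exploration
        a = random.choice(ACTIONS)
    else:                                            # otherwise act greedily
        a = max(ACTIONS, key=lambda act: q[(s, act)])
    s2 = min(10, max(0, s + a))
    r = 1.0 if s2 == GOAL else 0.0
    q[(s, a)] += 0.1 * (r + 0.9 * max(q[(s2, b)] for b in ACTIONS) - q[(s, a)])

print("learned move from position 2:", max(ACTIONS, key=lambda act: q[(2, act)]))  # toward the goal

# Consummatory: a rigid, stereotyped motor program triggered by a specific
# stimulus. No learning, no flexibility - always the same sequence.
def pick_up_pup():
    return ["lower head", "open jaw", "grip pup by scruff", "lift"]

print("triggered motor program:", pick_up_pup())
```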
Jun 17, 2024 • 32min

EA - Questionable Narratives of "Situational Awareness" by fergusq

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Questionable Narratives of "Situational Awareness", published by fergusq on June 17, 2024 on The Effective Altruism Forum. Introduction This is a response to the Situational Awareness essay series by Leopold Aschenbrenner. As a disclaimer, I am an AI pessimist, meaning that I don't believe there is evidence for AGI appearing any time soon. I do also believe that even if you are an AI optimist, you should view Aschenbrenner's text critically, as it contains numerous flawed arguments and questionable narratives, which I will go through in this post. The text has numerous dubious technical claims and flawed arguments, including misleading statements regarding RLHF[1], uncited claims about human intelligence[2], use of made-up units such as OOM[3] without any serious technical argumentation, use of made-up charts that extrapolate these made-up units, claims that current models could be "unhobbled"[4], and baseless claims such as that current AI is at the level of a preschooler or a high school student[5]. I have given some thoughts on these in the footnotes, although I don't consider myself the best person to criticize them. Instead, I will be focusing more on the narrative structure of the text, which I think is more important than the technical part. Reading this text gave me heavy propaganda-vibes, as if it were a political piece that tries to construct a narrative that aims to support certain political goals. Its technical argumentation is secondary to creating a compelling narrative (or a group of narratives). I will first go through the two most problematic narratives, the conspiracy-esque and US-centric narratives. Then, I will talk a bit about the technological narrative, which is the main narrative of the text. I stress that I don't necessarily believe that there is any malign intent behind these narratives, or that Aschenbrenner is trying to intentionally mislead people with them. However, I believe they should be pointed out, as I think these narratives are harmful to the AI safety community. The concepts of AGI and intelligence explosion are outlandish and suspicious to people who do not accept them. Using narratives often utilized by bad-faith actors makes it easier for readers to just discard what is being said. Conspiracy narratives The text opens with a description of how the writer is part of a very small group of enlightened people who have learned the truth: Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. [...] Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride. This invokes a conspiracy theory narrative that the world is "asleep" and must "wake up", and only a small group of conspirators and enlightened individuals know what is really going on.
This is then compared to real-life "conspiracies" such as the Manhattan project to lend credibility to such narratives, while ignoring the clear differences between them: the Manhattan project was a highly organized, goal-directed attempt to construct a weapon, which is not remotely similar to the decentralized actors currently developing AI systems. Later in the text, a hypothetical "AGI Manhattan Project" is described, further trying to frame the current AI discussion as similar to the discussion that took place in the couple of years before the real-life Manhattan project. Again, this ignores the fact that AI is being researched by thousands of people across the world, both in universities and in companies, and it has clear commercial value, wh...
Jun 17, 2024 • 13min

EA - Advice for EA org staff and EA group organisers interacting with political campaigns by Catherine Low

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Advice for EA org staff and EA group organisers interacting with political campaigns, published by Catherine Low on June 17, 2024 on The Effective Altruism Forum. Compiled by CEA's Community Health team. 2024 is the biggest year for elections in history(!), and while many of these elections have passed, some important elections are upcoming, including the UK and US elections, providing a potentially large opportunity to have an impact through political change. This post is intended: 1. To make it easier for EA group organisers and organisation staff to adhere to the laws in relevant countries 2. And more generally, to help the community be able to take high-impact actions now and in the future by reducing risks of polarisation of EA and the cause areas we care about. Two main concerns: Legal risks and risks around polarisation and epistemics Legal risks Charities and organisations associated with/funded by charities have constraints on what political activities they can do. See "More about legal risks." Note: This post is not legal advice. Our team is employed by US and UK charities (Effective Ventures US and UK). So, we have a little familiarity with the legal situations for groups/organisations that are based in the US or UK (many EA organisations), and groups/organisations that are funded by charities in the US or UK (even more EA groups and organisations). We have very little knowledge about the legal situation relating to other countries. It could be useful for groups/orgs in any country (including US and UK) to get independent legal advice. Risks around polarisation and epistemics These risks include: EA becoming more associated with specific parties or parts of the political spectrum, in a way that makes EAs less able to collaborate with others; issues EA works on becoming polarised or associated with a specific party; EA falling into lower standards of reasoning, honesty, etc. through feeling a need to compete in political arenas where good epistemics are not valued as highly; and creating suspicion about whether EAs are primarily motivated by seeking power rather than doing the most good. Of course, the upside of doing political work could be extremely high. So our recommendation isn't for EAs to stop doing political work, but to be very careful to think through risks when choosing your actions. Some related ideas about the risks of polarisation and political advocacy: 1. Climate change policy and politics in the US 2. Lesson 7: Even among EAs, politics might somewhat degrade our typical epistemics and rigor 3. To Oppose Polarization, Tug Sideways 4. Politics on the EA Forum More about legal risks If your group/organisation is a charity or is funded by a charity In many (or maybe all?) places, charities or organisations funded by charities are NOT allowed to engage in political campaigning. E.g., in the US, 501(c)(3) public charities are prohibited from "intervening in political campaigns" (more detail). This includes organisations that are funded by US 501(c)(3) charities (including Open Philanthropy's charitable arm, and Effective Ventures (which hosts EA Funds and CEA)).
This includes: financial support for a campaign, including reimbursing costs for people to engage in volunteer activities; endorsing or disapproving of a candidate, or referring to a candidate's characteristics or qualifications for office - in writing, speaking, mentions on the website, podcasts, etc. (language that could appear partisan, like stating "holding elected officials accountable", could also imply disapproval); taking action to help or hurt the chances of a candidate, which can be problematic even if you or your charity didn't intend to help or hurt the candidate; and staff taking political action that's seen as representing the organisation they work for, e.g. attending rallies or door knocking as ...
