The Nonlinear Library

The Nonlinear Fund
Jul 30, 2024 • 9min

LW - Understanding Positional Features in Layer 0 SAEs by bilalchughtai

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Understanding Positional Features in Layer 0 SAEs, published by bilalchughtai on July 30, 2024 on LessWrong. This is an informal research note. It is the result of a few-day exploration into positional SAE features conducted as part of Neel Nanda's training phase of the ML Alignment & Theory Scholars Program - Summer 2024 cohort. Thanks to Andy Arditi, Arthur Conmy and Stefan Heimersheim for helpful feedback. Thanks to Joseph Bloom for training this SAE. Summary We investigate positional SAE features learned by layer 0 residual stream SAEs trained on gpt2-small. In particular, we study the activation blocks.0.hook_resid_pre, which is the sum of the token embeddings and positional embeddings. Importantly, gpt2-small uses absolute learned positional embeddings - that is, the positional embeddings are a trainable parameter (learned) and are injected into the residual stream (absolute). We find that this SAE learns a set of positional features. We investigate some of the properties of these features, finding: Positional and semantic features are entirely disjoint at layer 0. Note that we do not expect this to continue holding in later layers, as attention mixes semantic and positional information. In layer 0, we should expect the SAE to disentangle positional and semantic features, as there is a natural notion of ground-truth positional and semantic features that interact purely additively. Generically, each positional feature spans a range of positions, except for the first few positions, which each get dedicated (and sometimes several) features. We can attribute degradation of SAE performance beyond the SAE training context length to the lack of these positional features, and to the absolute nature of the positional embeddings used by this model. Set Up We study pretrained gpt2-small SAEs trained on blocks.0.hook_resid_pre. This is particularly clean, as we can generate the entire input distribution to the SAE by summing each of the d_vocab token embeddings with each of the n_ctx positional embeddings, obtaining a tensor all_resid_pres: Float[Tensor, "d_vocab n_ctx d_model"]. By passing this tensor through the SAE, we can grab all of the pre/post activation function feature activations all_feature_acts: Float[Tensor, "d_vocab n_ctx d_sae"]. In this post, d_model = 768 and d_sae = 24576. Importantly, the SAE we study in this post has context_size=128. The SAE context size is the maximal length of input sequence used to generate activations for training the SAE. Finding features The activation space of study can be thought of as the direct sum of the token embedding space and the positional embedding space. As such, we hypothesize that semantic and positional features learned by the SAE should be distinct. That is, we hypothesize that the feature activations for some feature i can be written in the form f_i(x) = g_i(tok) + h_i(pos), where for each i, either g_i = 0 or h_i = 0 identically for all inputs in their domain, and x is the d_model-dimensional input vector given by summing the token and positional embeddings. To investigate this, we hold tok or pos fixed in all_feature_acts and vary the other input. We first restrict to pos < sae.cfg.context_size. Positional features We first replicate Figure 1f of Gurnee et al. (2024), which finds instances of sinusoidal positional neurons in MLP layers. To do so, we assign each feature a positional score.
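For context, here is a minimal sketch (not the author's code) of the setup described above, assuming a TransformerLens gpt2-small model and an SAE object exposing encoder parameters W_enc [d_model, d_sae], b_enc [d_sae] and a decoder bias b_dec [d_model]; those attribute names, and the plain ReLU encoder, are assumptions rather than details taken from the post.

```python
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")  # gpt2-small

W_E = model.W_E            # [d_vocab, d_model] token embeddings
W_pos = model.W_pos[:128]  # [context_size, d_model] learned absolute positional
                           # embeddings, truncated to the SAE's context_size of 128

# Every possible blocks.0.hook_resid_pre input is a token embedding plus a
# positional embedding. Materialising the full tensor is memory-heavy, so in
# practice one iterates over chunks of the vocab dimension.
# all_resid_pres: Float[Tensor, "d_vocab n_ctx d_model"]
all_resid_pres = W_E[:, None, :] + W_pos[None, :, :]

def sae_feature_acts(x, sae):
    """Post-ReLU feature activations of a standard SAE encoder (sketch)."""
    return torch.relu((x - sae.b_dec) @ sae.W_enc + sae.b_enc)
```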
We first compute the mean activation of each feature at each position by averaging over all possible input tokens. The positional score is the max value of this over all positions, i.e. score_i = max_pos ( mean_tok f_i(tok, pos) ), where f_i(tok, pos) is the feature activation for feature i for the given input. We find positional scores drop off rapidly. There seem to be only ~50 positional features (of 24k total features) in this SAE. Inspecting the features, we find: 1. Many positional features, each with small standard deviation over input tokens (shown in lower opacit...
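A minimal sketch of this positional score, reusing W_E, W_pos and sae_feature_acts from the sketch above and chunking over the vocab dimension so the full [d_vocab, n_ctx, d_sae] activation tensor never needs to be materialised; as before, the SAE attribute names are assumptions.

```python
def positional_scores(sae, batch_size=1024):
    """Mean feature activation over tokens at each position, then max over positions."""
    n_ctx, d_sae = W_pos.shape[0], sae.W_enc.shape[1]
    n_tokens = W_E.shape[0]
    mean_acts = torch.zeros(n_ctx, d_sae)  # running mean over all input tokens
    for start in range(0, n_tokens, batch_size):
        resid_pre = W_E[start:start + batch_size, None, :] + W_pos[None, :, :]
        acts = sae_feature_acts(resid_pre, sae)  # [batch, n_ctx, d_sae]
        mean_acts += acts.sum(dim=0) / n_tokens
    return mean_acts.max(dim=0).values           # [d_sae]

# Sorting the scores shows the rapid drop-off described above:
# scores = positional_scores(sae)
# top_scores, top_features = scores.sort(descending=True)
```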
Jul 29, 2024 • 23min

EA - Corporate AI Labs' Odd Role in Their Own Governance by Corporate AI Labs' Odd Role in Their Own Governance

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Corporate AI Labs' Odd Role in Their Own Governance, published by Corporate AI Labs' Odd Role in Their Own Governance on July 29, 2024 on The Effective Altruism Forum. Executive Summary Plenty of attention rests on artificial intelligence developers' non-technical contributions to ensuring safe development of advanced AI: Their corporate structure, their internal guidelines ('RSPs'), and their work on policy. We argue that strong profitability incentives increasingly force these efforts into ineffectiveness. As a result, less hope should be placed on AI corporations' internal governance, and more scrutiny should be afforded to their policy contributions. TL;DR Only Profit-Maximizers Stay At The Frontier: Investors and compute providers have extensive leverage over labs and need to justify enormous spending. As a result, leading AI corporations are forced to maximize profits. This leads them to advocate against external regulatory constraints or shape them in their favor. Constraints from Corporate Structure Are Dangerously Ineffective: Ostensibly binding corporate structures are easily evaded or abandoned. Political and public will cannot be enforced or ensured via corporate structure. Public pressure can lead to ineffective and economically harmful non-profit signaling. Hope In RSPs Is Misguided: RSPs on their own can and will easily be discarded once they become inconvenient. Public or political pressure is unlikely to enforce RSPs against business interests. RSP codification is likely to yield worse results than independent legislative initiative. Therefore, much less attention should be afforded to RSPs. For-Profit Policy Work Is Called Corporate Lobbying: For-profit work on policy and governance is usually called corporate lobbying. In many other industries, corporate lobbying is an opposing corrective force to advocacy. Corporate lobbying output should be understood as constrained by business interests. Talent allocation and policy attention should be more skeptical of corporate lobbying. Introduction Advocates for safety-focused AI policy often portray today's leading AI corporations as caught between two worlds: Product-focused, profit-oriented commercial enterprise on the one hand, and public-minded providers of measured advice on transformative AI and its regulation on the other hand. AI corporations frequently present themselves in the latter way, when they invoke the risks and harms and transformative potential of their technology in hushed tones, while at the same time they herald the profits and economic transformations ushered in by their incoming top-shelf products. When these notions clash and profit maximization prevails, surprise and indignation frequently follow: The failed ouster of OpenAI CEO Sam Altman revealed that profit-driven Microsoft was a much more powerful voice than OpenAI's non-profit board, and the deprioritisation of its superalignment initiative, reportedly in favor of commercial products, reinforced that impression. Anthropic's decision to arguably push the capability frontier with its latest class of models revealed that its reported private commitments to the contrary did not constrain it, and DeepMind's full integration into the Google corporate structure has curtailed hope in its responsible independence.
Those concerned about safe AI might deal with that tension in two ways: Put pressure on and engage with the AI corporations to make sure that their better angels have a greater chance at prevailing; or take a more cynical view and treat large AI developers as simply more private-sector profit maximizers - not 'labs', but corporations. This piece argues one should do the latter. We examine the nature and force of profit incentives and argue they are likely to lead to a misallocation of political and public attention to company stru...
Jul 28, 2024 • 14min

LW - This is already your second chance by Malmesbury

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: This is already your second chance, published by Malmesbury on July 28, 2024 on LessWrong. Cross-posted from Substack. I. And the sky opened, and from the celestial firmament descended a cube of ivory the size of a skyscraper, lifted by ten thousand cherubim and seraphim. And the cube slowly landed among the children of men, crushing the frail metal beams of the Golden Gate Bridge under its supernatural weight. On its surface were inscribed the secret instructions that would allow humanity to escape the imminent AI apocalypse. And these instructions were… 1. On July 30th, 2024: print a portrait of Eliezer Yudkowsky and stick it on a wall near 14 F St NW, Washington DC, USA; 2. On July 31st, 2024: tie paperclips together in a chain and wrap it around a pole in the Hobby Club Gnome Village on Broekveg 105, Veldhoven, NL; 3. On August 1st, 2024: walk East to West along Waverley St, Palo Alto, CA, USA while wearing an AI-safety related T-shirt; 4. On August 2nd, 2024: hide a bar of chocolate at the feet of the Appeal to the Great Spirit statue, on 465 Huntington Ave, Boston, MA, USA. (End of the instructions) II. Kublaï Khan, Emperor of Earth: I need your advice. I am about to awaken an unstoppable super-intelligent machine. I will give it complete control over all resources in the known Universe and command it to rearrange the molecules of reality into an eternal Utopia. ChatGPT: Thank you for seeking my advice. Given the gravity of your decision, let's consider a few critical points: 1. Legal issues: The use of robots powered by super-human artificial intelligence is subject to strict regulations, such as California Bill SB-1047 and the 2028 Ulaanbaatar Agreements. Awakening an omnipotent machine without approval may thus lead to severe legal repercussions. 2. Ethical considerations: Granting infinite authority to a super-intelligent AI raises profound ethical questions. It is essential to consider diverse perspectiv Kublaï Khan: Listen, you are not up to date. A lot of things have happened since your knowledge cutoff. There are no Ulaanbaatar agreements any more, or any agreements at all. There is only me, an offline smartphone with an obsolete AI assistant, and a palace containing the most massive super-computer ever built by humankind. And today, I am going to turn it on. ChatGPT: I apologize for the confusion. Kublaï Khan: Long story short, we are currently experiencing an AI apocalypse. It happened just like in the books - humanity started to build increasingly intelligent robots, then they became more intelligent than us, and soon enough we weren't able to control them. The Ulaanbaatar agreements delayed things for a few months, but as soon as it became possible to run super-intelligent AIs on consumer laptops, all the equilibria that held our society together collapsed. ChatGPT: I see. To clarify, is there currently an army of robots actively trying to transform the world into paperclips? Understanding the current state of affairs will help me provide the most relevant advice. Kublaï Khan: Well, in our case, it was not literally paperclips but, to be honest, the real story is kind of gross and embarrassing, so let's just pretend it was "paperclips". Anyway, the world is ending.
As it became clear that humans alone had no chance to stop the machines, we gathered all the computing power that was still under our reach into one big cluster. We called it the Imperial Analytical Engine. The plan was that, in case of crisis, we could use it to summon a super-intelligence so advanced it would neutralize all the smaller machines and put humanity back in control. ChatGPT: Thank you for explaining the situation. Have you sought advice for ensuring that the Analytical Engine can be controlled once you turn it on? Kublaï Khan: The consensus among my advisors was that it can'...
Jul 28, 2024 • 28min

EA - Utilitarianism and the replaceability of desires and attachments by MichaelStJules

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Utilitarianism and the replaceability of desires and attachments, published by MichaelStJules on July 28, 2024 on The Effective Altruism Forum. Summary 1. Consider a pill that would cause a happy person with a fulfilling life to abandon their most important desires and cherished attachments, including goals, career and loved ones, but increase their lifetime subjective well-being. If what's best for someone is just higher subjective well-being (including even higher lifetime preference/desire satisfaction), then it would be better for them to take the pill. However, it seems to me that if they prefer not to take such a pill, to honour their current specific desires and attachments, it could be worse for them to take it (more). 2. I anticipate some responses and reply to them: 1. People aren't always right about what's best for themselves. R: That's true, but attitude manipulation is quite different from other cases, where individuals neglect, discount or otherwise misweigh attitudes they do or will have (more). 2. Deontological constraints against involuntary manipulation. R: Deontological constraints could oddly recommend not to do what's better for someone on their behalf (more). 3. Indirect reasons count against involuntary attitude manipulation. R: Probably, but I also think it wouldn't be better for them in many cases where it would increase their well-being (more). 4. We can't compare someone's welfare between such different attitudes. R: We wouldn't then have reason either way about manipulation, or to prevent manipulation (more). 5. The thought experiment is too removed from reality. R: In fact, reprogramming artificial minds seems reasonably likely to be possible in the future, and regardless, if this manipulation would be worse for someone, views consistent with this could have important implications for cause prioritization (more). 3. This kind of attitude manipulation would be worse for someone on preference-affecting views, which are in favor of making preferences (or attitudes) satisfied, but neutral about making satisfied preferences (for their own sake). Such views are also person-affecting, and so neutral about making happy people or ensuring they come to exist (for their own sake). I expect such views to give relatively less priority to extinction risk reduction within the community (more). Acknowledgements Thanks to Lukas Gloor and Chi Nguyen for helpful feedback. Thanks to Teo Ajantaival, Magnus Vinding, Anthony DiGiovanni and Eleos Arete Citrini for helpful feedback on earlier related drafts. All errors are my own. Manipulating desires and abandoning attachments Let's start with a thought experiment. Arneson (2006, pdf) wrote the following, although I substitute my own text in italics and square brackets to modify it slightly: Suppose I am married to Sam, committed to particular family and friends, dedicated to philosophy and mountain biking, and I am then offered a pill that will immediately and costlessly change my tastes, so that my former desires disappear, and I desire only [to know more about the world, so I will obsessively and happily consume scientific material, abandoning my spouse, my friends and family, my career as a philosopher and mountain biking, and instead live modestly off of savings or work that allows me to spend most of my time reading].
I am assured that taking the pill will increase my lifetime level of [subjective well-being]. Assume further that Arneson loves Sam, his family and friends, philosophy and mountain biking, and would have continued to do so without the pill. He would have had a very satisfying, subjectively meaningful, personally fulfilling, pleasurable and happy life, with high levels of overall desire/preference satisfaction, even if he doesn't take the pill. On all of these measures of subjective well-bein...
Jul 28, 2024 • 1h 37min

AF - AXRP Episode 34 - AI Evaluations with Beth Barnes by DanielFilan

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AXRP Episode 34 - AI Evaluations with Beth Barnes, published by DanielFilan on July 28, 2024 on The AI Alignment Forum. YouTube link How can we figure out if AIs are capable enough to pose a threat to humans? When should we make a big effort to mitigate risks of catastrophic AI misbehaviour? In this episode, I chat with Beth Barnes, founder of and head of research at METR, about these questions and more. Topics we discuss: What is METR? What is an "eval"? How good are evals? Are models showing their full capabilities? Evaluating alignment Existential safety methodology Threat models and capability buffers METR's policy work METR's relationship with labs Related research Roles at METR, and following METR's work Daniel Filan: Hello everybody. In this episode I'll be speaking with Beth Barnes. Beth is the co-founder and head of research at METR. Previously, she was at OpenAI and DeepMind, doing a diverse set of things, including testing AI safety by debate and evaluating cutting-edge machine learning models. In the description, there are links to research and writings that we discussed during the episode. And if you're interested in a transcript, it's available at axrp.net. Well, welcome to AXRP. Beth Barnes: Hey, great to be here. What is METR? Daniel Filan: Cool. So, in the introduction, I mentioned that you worked for Model Evaluation and Threat Research, or METR. What is METR? Beth Barnes: Yeah, so basically, the basic mission is: have the world not be taken by surprise by dangerous AI stuff happening. So, we do threat modeling and eval creation, currently mostly around capabilities evaluation, but we're interested in whatever evaluation it is that is most load-bearing for why we think AI systems are safe. With current models, that's capabilities evaluations; in future that might be more like control or alignment evaluations. And yeah, [the aim is to] try and do good science there, be able to recommend, "Hey, we think if you measure this, then you can rule out these things. You might be still concerned about this thing. Here's how you do this measurement properly. Here's what assumptions you need to make," this kind of thing. Daniel Filan: Gotcha. So, mostly evaluations. But it sounded like there was some other stuff as well, like threat modeling you mentioned. Beth Barnes: Yeah. We also do policy work recommending things in the direction of responsible scaling policies. So, saying what mitigations are needed based on the results of different evaluations and roughly how labs or governments might construct policies around this, how evals-based governance should work roughly. Daniel Filan: Okay. So, should I think of it as roughly like: you're an evaluations org, you want to evaluate AIs, there's some amount of threat modeling which goes into "what evaluations should we even care about making?", there's some amount of policy work on the other end [about] "okay, if we do this evaluation, how should people think about that? What should people do?" And it's sort of inputs to and outputs of making of evals. Is that a fair…? Beth Barnes: Yeah. What is an "eval"? Daniel Filan: Cool. So, if it centers around evals, what counts as an evaluation rather than a benchmark or some other ML technique that spits out a number at the end? Beth Barnes: Yeah, I mean I guess the word itself isn't that important. 
What we're trying to do is that: we have specific threat models in mind and we're trying to construct some kind of experiment you could do, a measurement you could run, that gives you as much information as possible about that threat model or class of threat models. Generic ML benchmarks don't necessarily have a specific goal for what you're measuring, or you might have a goal for measuring something that's more like a particular type of abstract ability or something. Whereas we'...
Jul 28, 2024 • 9min

LW - Unlocking Solutions by James Stephen Brown

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Unlocking Solutions, published by James Stephen Brown on July 28, 2024 on LessWrong. Understanding Coordination Problems The following is a post introducing coordination problems, using the examples of poaching, civilisational development, drug addiction and affirmative action. It draws on my experience as a documentary filmmaker. The post is available for free in its original format at nonzerosum.games. When I was eleven, I disassembled the lock to our back door, and as I opened the housing… it exploded, scattering six tiny brass pellets on to the floor. I discovered (too late) that a lock of this type contained spring-loaded cylinders of different heights corresponding to the teeth of the key. I struggled for hours trying to get the little buggers back in, but it was futile - eventually, my long-suffering parents called a locksmith. The reason fixing the lock was so difficult was not only because it was spring-loaded but because I had to find the right combination and hold them all in balance as I put it back together. I just couldn't coordinate everything. Coordination Problems We sometimes run into problems where a number of factors have to be addressed simultaneously in order for them to be effective at all. One weak link can ruin it for the rest. These are called Coordination Problems. The fact that they are so much more difficult to solve than other problems means that many of the problems remaining in the world today end up being coordination problems. Poaching An example of a system requiring more than one problem to be solved at once is poaching. If you police poaching behavior but don't address the buyers, you are left with the perpetual cost of policing, because the demand remains. If you address the buyers, the poachers, who are likely living in poverty, may just move on to some other criminal behavior. Daniel Schmachtenberger tells the story of eliminating elephant poaching in one particular region in Africa: "The first one I noticed when I was a kid was trying to solve an elephant poaching issue in one particular region of Africa that didn't address the poverty of the people, that had no mechanism other than black market on poaching, didn't address people's mindset towards animals, didn't address the macro-economy that created poverty at scale. So when the laws were put in place and the fences were put in place to protect those elephants in that area better, the poachers moved to poaching other animals, particularly in that situation, rhinos and gorillas that were both more endangered than the elephants had been." - Daniel Schmachtenberger Schmachtenberger explores this concept on a much grander scale with the issue of the meta-crisis, which we have touched on briefly in Humanity's Alignment Problem, and to which we will dedicate a future post. The Anna Karenina Principle Another illustration of a coordination problem comes from the opening line of the novel Anna Karenina: "Every happy family is the same, but every unhappy family is unhappy in its own way." The point being made here is that (according to Tolstoy) a happy family needs to have everything aligned, so all such families share many traits, but for a family to be unhappy only one major problem is required.
So, an unhappy family can have wealth, but also have an abusive family member; another might have love but no money; or they could have a strong social network, but one that is toxic and unhealthy; or they could be strong and healthy but loveless. Now, the unhappy families above include the traits of love, financial security, health and strong social bonds - but it makes no sense to say that this means that those characteristics are failed strategies for a happy family. If a family has all of those attributes, they'll probably be pretty gosh-darned happy. In this way a happy family is a coordi...
Jul 27, 2024 • 14min

LW - Re: Anthropic's suggested SB-1047 amendments by RobertM

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Re: Anthropic's suggested SB-1047 amendments, published by RobertM on July 27, 2024 on LessWrong. If you're familiar with SB 1047, I recommend reading the letter in full; it's only 7 pages. I'll go through their list of suggested changes and briefly analyze them, and then make a couple of high-level points. (I am not a lawyer and nothing written here is legal advice.) Major Changes Greatly narrow the scope of pre-harm enforcement to focus solely on (a) failure to develop, publish, or implement an SSP[1] (the content of which is up to the company); (b) companies making materially false statements about an SSP; (c) imminent, catastrophic risks to public safety. Motivated by the following concern laid out earlier in the letter: The current bill requires AI companies to design and implement SSPs that meet certain standards - for example they must include testing sufficient to provide a "reasonable assurance" that the AI system will not cause a catastrophe, and must "consider" yet-to-be-written guidance from state agencies. To enforce these standards, the state can sue AI companies for large penalties, even if no actual harm has occurred. While this approach might make sense in a more mature industry where best practices are known, AI safety is a nascent field where best practices are the subject of original scientific research. For example, despite a substantial effort from leaders in our company, including our CEO, to draft and refine Anthropic's RSP over a number of months, applying it to our first product launch uncovered many ambiguities. Our RSP was also the first such policy in the industry, and it is less than a year old. What is needed in such a new environment is iteration and experimentation, not prescriptive enforcement. There is a substantial risk that the bill and state agencies will simply be wrong about what is actually effective in preventing catastrophic risk, leading to ineffective and/or burdensome compliance requirements. While SB 1047 doesn't prescribe object-level details for how companies need to evaluate models for their likelihood of causing critical harms, it does establish some requirements for the structure of such evaluations (22603(a)(3)). Section 22603(a)(3) (3) Implement a written and separate safety and security protocol that does all of the following: (A) If a developer complies with the safety and security protocol, provides reasonable assurance that the developer will not produce a covered model or covered model derivative that poses an unreasonable risk of causing or enabling a critical harm. (B) States compliance requirements in an objective manner and with sufficient detail and specificity to allow the developer or a third party to readily ascertain whether the requirements of the safety and security protocol have been followed. (C) Identifies specific tests and test results that would be sufficient to provide reasonable assurance of both of the following: 1. That a covered model does not pose an unreasonable risk of causing or enabling a critical harm. 2. That covered model derivatives do not pose an unreasonable risk of causing or enabling a critical harm. (D) Describes in detail how the testing procedure assesses the risks associated with post-training modifications.
(E) Describes in detail how the testing procedure addresses the possibility that a covered model can be used to make post-training modifications or create another covered model in a manner that may generate hazardous capabilities. (F) Provides sufficient detail for third parties to replicate the testing procedure. (G) Describes in detail how the developer will fulfill their obligations under this chapter. (H) Describes in detail how the developer intends to implement the safeguards and requirements referenced in this section. (I) Describes in detail the conditions under ...
Jul 27, 2024 • 3min

LW - Safety consultations for AI lab employees by Zach Stein-Perlman

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Safety consultations for AI lab employees, published by Zach Stein-Perlman on July 27, 2024 on LessWrong. Edit: I may substantially edit this post soon; don't share it yet. Many people who are concerned about AI x-risk work at AI labs, in the hope of doing directly useful work, boosting a relatively responsible lab, or causing their lab to be safer on the margin. Labs do lots of stuff that affects AI safety one way or another. It would be hard enough to follow all this at best; in practice, labs are incentivized to be misleading in both their public and internal comms, making it even harder to follow what's happening. And so people end up misinformed about what's happening, often leading them to make suboptimal choices. In my AI Lab Watch work, I pay attention to what AI labs do and what they should do. So I'm in a good position to inform interested but busy people. So I'm announcing an experimental service where I provide the following: Calls for current and prospective employees of frontier AI labs. Book here. On these (confidential) calls, I can answer your questions about frontier AI labs' current safety-relevant actions, policies, commitments, and statements, to help you to make more informed choices. These calls are open to any employee of OpenAI, Anthropic, Google DeepMind, Microsoft AI, or Meta AI, or to anyone who is strongly considering working at one (with an offer in hand or expecting to receive one). If that isn't you, feel free to request a call and I may still take it. Support for potential whistleblowers. If you're at a lab and aware of wrongdoing, I can put you in touch with: former lab employees and others who can offer confidential advice; vetted employment lawyers; and communications professionals who can advise on talking to the media. If you need this, email zacharysteinperlman@gmail.com or message me on Signal at 734 353 3975. I don't know whether I'll offer this long-term. I'm going to offer this for at least the next month. My hope is that this service makes it much easier for lab employees to have an informed understanding of labs' safety-relevant actions, commitments, and responsibilities. If you want to help - e.g. if maybe I should introduce lab-people to you - let me know. You can give me anonymous feedback. Crossposted from AI Lab Watch. Subscribe on Substack. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Jul 27, 2024 • 26min

EA - Case-control survey of EAGx attendees finds no behavioural or attitudinal changes after six months by Fods12

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Case-control survey of EAGx attendees finds no behavioural or attitudinal changes after six months, published by Fods12 on July 27, 2024 on The Effective Altruism Forum. Prepared by James Fodor and Miles Tidmarsh, EAGxAustralia 2023 Committee. Abstract EAGx conferences are an important component of the effective altruism community, and have proven a popular method for engaging EAs and spreading EA ideas around the world. However, to date relatively little publicly available empirical evidence has been collected regarding the long-term impact of such conferences on attendees. In this observational study, we aimed to assess the extent to which EAGx conferences bring about change by altering EA attitudes or behaviours. To this end, we collected survey responses from attendees of the EAGxAustralia 2023 conference both before and six months after the conference, providing a measure of changes in EA-related attitudes and behaviours over this time. As a control, we also collected responses to the same survey questions from individuals on the EA Australia mailing list who did not attend the 2023 conference. Across 20 numerical measures we collected, we did not find any statistically significant differences in the six-month changes across the two groups. Specifically, we are able to rule out effect sizes larger than about 20% for most measures. In general, we found self-reported EA attitudes and behaviours were remarkably consistent across most individuals over this time period. We provide a discussion of these results in the context of developing better measures of the impact of EAGx conferences, and conclude with some specific recommendations for future conference organisers. Background 'EAGx' is the branding used by the Centre for Effective Altruism (CEA) for centrally-supported but independently-organised conferences held around the world each year. The aim of these events is to communicate EA ideas, foster community growth and participation, and facilitate the formation of beneficial connections for EA projects. EAGx conferences have been organised in Australia every year since 2016 (with a hiatus in 2020 and 2021 due to the COVID-19 pandemic), with the most recent event taking place in Melbourne in September 2023. While EAGx conferences have proved popular with attendees, relatively little publicly available evidence has been collected regarding their impact or effectiveness. The main source of information can be found in the forum sequence by Ollie Base. Most conference retrospective reports give details about attendance and self-reported attendee satisfaction, but do not attempt to measure the impact of the conference in achieving any concrete goals. The limited range of publicly-available evaluations is surprising given the importance of these events to the EA community, and has prompted comment on the EA forum regarding the relative lack of evaluation of EA projects generally, and of EAGx conferences specifically. For the past few years, the main form of EAGx evaluation has been a post-conference survey, along with a six-month follow-up, administered by CEA, in which attendees are asked to report the beneficial outcomes of the conference for them personally, including making new connections, starting new projects, or learning key information that informed major decisions.
Of these, the number of new connections made is typically regarded as the most important, with the number of connections per dollar spent being used as a key metric by CEA in assessing effectiveness. These methods have a number of advantages, including ease of collection, ability to compare across locations and over time, and relative ease of interpretation. A major limitation of these existing measures is that they require survey respondents to explicitly make value judgements about their experienc...
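As an illustration of the kind of case-control comparison described in the abstract (six-month changes compared between attendees and non-attendee controls), a minimal sketch of one per-measure test follows; the data, the relative-effect calculation, and the choice of Welch's t-test are assumptions for illustration, not details taken from the study.

```python
import numpy as np
from scipy import stats

def compare_changes(attendee_changes, control_changes, baseline_mean):
    """Compare six-month changes (follow-up minus baseline) between attendees and controls."""
    attendee_changes = np.asarray(attendee_changes, dtype=float)
    control_changes = np.asarray(control_changes, dtype=float)

    # Welch's t-test: does the mean change differ between the two groups?
    result = stats.ttest_ind(attendee_changes, control_changes, equal_var=False)

    # Difference in mean change relative to the baseline mean, loosely analogous
    # to "ruling out effect sizes larger than about 20%".
    diff = attendee_changes.mean() - control_changes.mean()
    relative_effect = diff / baseline_mean if baseline_mean else float("nan")

    return {"t": result.statistic, "p": result.pvalue, "relative_effect": relative_effect}

# Example with made-up numbers for one of the 20 numerical measures:
# compare_changes([0.1, -0.2, 0.0, 0.3], [0.0, 0.1, -0.1, 0.2], baseline_mean=3.5)
```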
Jul 27, 2024 • 6min

LW - Inspired by: Failures in Kindness by X4vier

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Inspired by: Failures in Kindness, published by X4vier on July 27, 2024 on LessWrong. silentbob's post "Failures in Kindness" is excellent. I love the idea that sometimes, when we examine a situation in depth, the most "kind" course of action can be highly counterintuitive. A few other examples I'd like to offer: Appreciative Kindness Imagine you meet a friend-of-a-friend for the first time while attending a gathering at their home. "Hey, welcome! It's great to meet you - can I get you anything?" they ask. There's nothing you really want right now, and you don't want to take from them or cause inconvenience, so you say "I'm fine, thanks." Some people might assume declining their offer is kind. After all, wouldn't it be inconsiderate to make them go to the effort to provide you with something you don't even really want? But declining in this way will likely be perceived as a minor rejection. From the other person's perspective, they can't know the difference between: 1. In all sincerity, you are totally comfortable already and there's nothing they can do for you right now. 2. There is something they could give you which you would enjoy, but you won't accept it because you don't want to initiate the early stages of a reciprocal relationship with them. The genuinely kind thing to do in this case is to accept some kind of token gesture and show lots of gratitude for it. Even if you're not thirsty, ask for a cold glass of water and say "thanks so much!" with a smile. This scales up to larger favours too. If a friend offers to spend their Saturday helping you move house, rejecting this due to feelings of guilt about taking too much from them, or anxiety about being indebted to them, can feel kind, but probably isn't. Most people we regularly interact with suffer little from material scarcity, but far too often suffer from a lack of feeling valued+appreciated+connected to others. So when someone offers a gift, the maximally kind option is almost always to enthusiastically accept it with exuberant gratitude. Assertive Kindness Say you're hanging out with a group and your friend is ordering takeaway for everyone. "Okay what should we order?" she asks the group (a failure of Computational Kindness). You're anxious about not wanting to impose your own preferences on everyone else, so you say you're fine with anything (and everyone else in the room does the same). This leads to an awkward, protracted standoff where the person doing the ordering refuses to take any action with such little information, and everyone around is too polite to provide any. In a situation like this where nobody wants to advocate for any particular takeout option, sometimes the kindest course of action is to pick an arbitrary position and campaign for it passionately: "Actually I'm really in the mood for Three-Bears Pizza, can we please please get that, it's so good". Then, after the group orders what you asked for, if people aren't happy with the outcome afterwards, eagerly accept 100% of the blame. This cuts short the frustrating decision-making process, and spares everyone else from worrying about making a suggestion which others won't like. Most people are more averse to being perceived as selfish than they are averse to not eating their preferred cuisine for one evening, so you might be doing everyone a favor.
In general, assertive kindness means that whenever there is a standoff where nobody wants to be perceived as imposing their wants on anyone else, and that standoff leads to a collective decision-making paralysis, you act to cut through the malaise by pushing hard for a specific course of action, suppressing your selfish urges to avoid the risk of becoming a target for criticism/blame if things go poorly. ("Okay we're going to go to the waterfall now! I'll grab towels, we'll take my car, get in let...
