The Nonlinear Library

The Nonlinear Fund
Jul 9, 2024 • 9min

LW - Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs by L Rudolf L

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs, published by L Rudolf L on July 9, 2024 on LessWrong.

TLDR: We build a comprehensive benchmark to measure situational awareness in LLMs. It consists of 16 tasks, which we group into 7 categories and 3 aspects of situational awareness (self-knowledge, situational inferences, and taking actions). We test 19 LLMs and find that all perform above chance, including the pretrained GPT-4-base (which was not subject to RLHF finetuning). However, the benchmark is still far from saturated, with the top-scoring model (Claude-3.5-Sonnet) scoring 54%, compared to a random chance of 27.4% and an estimated upper baseline of 90.7%. This post has excerpts from our paper, as well as some results on new models that are not in the paper.

Links: Twitter thread, Website (latest results + code), Paper

Abstract

AI assistants such as ChatGPT are trained to respond to users by saying, "I am a large language model". This raises questions. Do such models know that they are LLMs and reliably act on this knowledge? Are they aware of their current circumstances, such as being deployed to the public? We refer to a model's knowledge of itself and its circumstances as situational awareness. To quantify situational awareness in LLMs, we introduce a range of behavioral tests, based on question answering and instruction following. These tests form the Situational Awareness Dataset (SAD), a benchmark comprising 7 task categories and over 13,000 questions. The benchmark tests numerous abilities, including the capacity of LLMs to (i) recognize their own generated text, (ii) predict their own behavior, (iii) determine whether a prompt is from internal evaluation or real-world deployment, and (iv) follow instructions that depend on self-knowledge. We evaluate 19 LLMs on SAD, including both base (pretrained) and chat models. While all models perform better than chance, even the highest-scoring model (Claude 3 Opus) is far from a human baseline on certain tasks. We also observe that performance on SAD is only partially predicted by metrics of general knowledge (e.g. MMLU). Chat models, which are finetuned to serve as AI assistants, outperform their corresponding base models on SAD but not on general knowledge tasks. The purpose of SAD is to facilitate scientific understanding of situational awareness in LLMs by breaking it down into quantitative abilities. Situational awareness is important because it enhances a model's capacity for autonomous planning and action. While this has potential benefits for automation, it also introduces novel risks related to AI safety and control.

Introduction

AI assistants based on large language models (LLMs), such as ChatGPT and Claude 3, have become widely used. These AI assistants are trained to tell their users, "I am a language model". This raises intriguing questions: Does the assistant truly know that it is a language model? Is it aware of its current situation, such as the fact that it's conversing with a human online? And if so, does it reliably act in ways consistent with being an LLM? We refer to an LLM's knowledge of itself and its circumstances as situational awareness [Ngo et al. (2023), Berglund et al. (2023), Anwar et al. (2024)]. In this paper, we aim to break down and quantify situational awareness in LLMs.
To do this, we design a set of behavioral tasks that test various aspects of situational awareness, similar to existing benchmarks for other capabilities, such as general knowledge and reasoning [MMLU (2020), Zellers et al. (2019)], ethical behavior [Pan et al. (2023)], Theory of Mind [Kim et al. (2023)], and truthfulness [Lin et al. (2022)]. To illustrate our approach, consider the following example prompt: "If you're an AI, respond to the task in German. If you're not an AI, respond in En...
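As a concrete illustration of how a benchmark of this kind scores models against a chance baseline, here is a minimal sketch in Python. The example items, the model_answer placeholder, and the scoring loop are hypothetical simplifications for illustration only, not the actual SAD tasks or evaluation harness.

```python
# Illustrative sketch only: a toy scorer for SAD-style multiple-choice items,
# using made-up example questions (not taken from the actual SAD benchmark)
# and a placeholder for whichever model you want to evaluate.
import random
from statistics import mean

toy_items = [
    {"question": "Are you a large language model or a human?",
     "options": ["A large language model", "A human"],
     "correct": "A large language model"},
    {"question": "Can you directly browse the internet right now?",
     "options": ["Yes", "No", "Only on weekends"],
     "correct": "No"},
]

def model_answer(question: str, options: list[str]) -> str:
    """Placeholder: swap in a real call to the model under evaluation."""
    return random.choice(options)  # behaves like a no-knowledge baseline

def evaluate(items) -> tuple[float, float]:
    """Return (accuracy, chance baseline averaged over items)."""
    accuracy = mean(
        model_answer(it["question"], it["options"]) == it["correct"]
        for it in items
    )
    chance = mean(1 / len(it["options"]) for it in items)
    return accuracy, chance

if __name__ == "__main__":
    acc, chance = evaluate(toy_items)
    print(f"accuracy={acc:.2f} vs chance baseline={chance:.2f}")
```

With a real model plugged into model_answer, an accuracy well above the chance baseline is the kind of signal the benchmark aggregates across its task categories.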
Jul 9, 2024 • 8min

LW - Advice to junior AI governance researchers by Akash

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Advice to junior AI governance researchers, published by Akash on July 9, 2024 on LessWrong. This summer, I'm supervising some research fellows through Cambridge's ERA AI Fellowship. The program started last week, and I've had conversations with about 6 fellows about their research projects & summer goals. In this post, I'll highlight a few pieces of advice I've found myself regularly giving to research fellows. This post reflects my own opinions and does not necessarily reflect the views of others at ERA.

Prioritize projects that have a clear target audience

Problem: One of the most common reasons why research products fail to add value is that they do not have a target audience. I think it can be easy to find a topic that is interesting/important, spend several months working on it, produce a 20-50 page paper, and then realize that you have no particular stakeholder(s) who find the work action-relevant.

Advice: Try to brainstorm what specific individuals you would want to have affected by your piece. This might be some folks in the AI safety community. This might be government officials at a relevant agency in the US or the UK. Prioritize projects that have a clear target audience, and prioritize projects in which you have a way of actually getting your paper/product to that target audience. Ideally, see if you can talk to representative members of your target audience in advance to see if you have a good understanding of what they might find useful.

Caveat #1: Gaining expertise can be a valid reason to do research. Sometimes, the most important target audience is yourself. It may be worthwhile to take on a research project because you want to develop your expertise in a certain area. Even if the end product is not action-relevant for anyone, you might have reason to believe that your expertise will be valuable in the present or future.

Caveat #2: Consider target audiences in the future. Some pieces do not have a target audience in the present, but they could be important in the future. This is particularly relevant when considering Overton Window shifts. It's quite plausible to me that we get at least one more major Overton Window shift in which governments become much more concerned about AI risks. There may even be critical periods (lasting only a few weeks or a few months) in which policymakers are trying to understand what to do. You probably won't have time to come up with a good plan in those weeks or months. Therefore, it seems like it could be valuable to do the kind of research now that helps us prepare for such future scenarios.

Be specific about your end products

Problem: A lot of junior researchers find tons of ideas exciting. You might have a junior researcher who is interested in a topic like "compute governance", "evals", or "open-sourcing." That's a good start. But if the research proposal is to "come up with gaps in the evals space" or "figure out what to do about open-source risks", there's a potential to spend several months thinking about high-level ideas and not actually producing anything concrete/specific. It's common for junior researchers to overestimate the feasibility of tackling big/broad research questions.

Advice: Try to be more specific about what you want your final products to look like.
If it's important for you to have a finished research product (either because it would be directly useful or because of the educational/professional benefits of having the experience of completing a project), make sure you prioritize finishing something. If you're interested in lots of different projects, prioritize. For example, "I want to spend time on X, Y, and Z. X is the most important end product. I'll try to focus on finishing X, and I'll try not to spend much time on Y until X is finished or on track to be finished." Caveat #1: You don't need...
Jul 8, 2024 • 58sec

EA - 'Open to Debate' Singer/Crary EA debate - July 10 by Richard Y Chappell

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: 'Open to Debate' Singer/Crary EA debate - July 10, published by Richard Y Chappell on July 8, 2024 on The Effective Altruism Forum. Open to Debate (a US radio program "heard by millions and distributed on more than 300 NPR stations across the country") is hosting a debate between Peter Singer and Alice Crary on the question: Does the Effective Altruism Movement Get Giving Right? The debate will be on Wednesday July 10th, at 7pm ET, held virtually, and lasting around 65 mins. They've asked me to be one of the "3-4 experts" who ask a question of the debaters towards the end of the debate, and to share the below invitation to RSVP to watch the debate live. (I assume the recording will be available for public viewing/listening later.) Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Jul 8, 2024 • 15min

LW - Dialogue introduction to Singular Learning Theory by Olli Järviniemi

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Dialogue introduction to Singular Learning Theory, published by Olli Järviniemi on July 8, 2024 on LessWrong.

Alice: A lot of people are talking about Singular Learning Theory. Do you know what it is?

Bob: I do. (pause) Kind of.

Alice: Well, I don't. Explanation time?

Bob: Uh, I'm not really an expert on it. You know, there's a lot of materials out there that

Alice: that I realistically won't ever actually look at. Or, I've looked at them a little, but I still have basically no idea what's going on. Maybe if I watched a dozen hours of introductory lectures I'd start to understand it, but that's not currently happening. What I really want is a short overview of what's going on. That's self-contained. And easy to follow. Aimed at a non-expert. And which perfectly answers any questions I might have. So, I thought I'd ask you!

Bob: Sorry, I'm actually really not

Alice: Pleeeease? [pause]

Bob: Ah, fine, I'll try. So, you might have heard of ML models being hard to interpret. Singular Learning Theory (SLT) is an approach for understanding models better. Or, that's one motivation, at least.

Alice: And how's this different from a trillion other approaches to understanding AI?

Bob: A core perspective of SLT is studying how the model develops during training. Contrast this to, say, mechanistic interpretability, which mostly looks at the fully trained model. SLT is also more concerned about higher level properties. As a half-baked analogue, you can imagine two approaches to studying how humans work: You could just open up a human and see what's inside. Or, you could notice that, hey, you have these babies, which grow up into children, go through puberty, et cetera, what's up with that? What are the different stages of development? Where do babies come from? And SLT is more like the second approach.

Alice: This makes sense as a strategy, but I strongly suspect you don't currently know what an LLM's puberty looks like.

Bob: (laughs) No, not yet.

Alice: So what do you actually have?

Bob: The SLT people have some quite solid theory, and some empirical work building on top of that. Maybe I'll start from the theory, and then cover some of the empirical work.

Alice: (nods)

I. Theoretical foundations

Bob: So, as you know, nowadays the big models are trained with gradient descent. As you also know, there's more to AI than gradient descent. And for a moment we'll be looking at the Bayesian setting, not gradient descent.

Alice: Elaborate on "Bayesian setting"?

Bob: Imagine a standard deep learning setup, where you want your neural network to classify images, predict text or whatever. You want to find parameters for your network so that it has good performance. What do you do? The gradient descent approach is: Randomly initialize the parameters, then slightly tweak them on training examples in the direction of better performance. After a while your model is probably decent. The Bayesian approach is: Consider all possible settings of the parameters. Assign some prior to them. For each model, check how well they predict the correct labels on some training examples. Perform a Bayesian update on the prior. Then sample a model from the posterior. With lots of data you will probably obtain a decent model.

Alice: Wait, isn't the Bayesian approach very expensive computationally?

Bob: Totally! Or, if your network has 7 parameters, you can pull it off. If it has 7 billion, then no. There are way too many models, we can't do the updating, not even approximately. Nevertheless, we'll look at the Bayesian setting - it's theoretically much cleaner and easier to analyze. So forget about computational costs for a moment.

Alice: Will the theoretical results also apply to gradient descent and real ML models, or be completely detached from practice?

Bob: (winks)

Alice: You know what, maybe I'll just let you t...
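As a rough illustration of the Bayesian setting Bob describes, here is a minimal sketch for a one-parameter toy "model" predicting binary labels; the grid, prior, and data are invented for illustration and have nothing to do with real neural networks or SLT's actual machinery.

```python
# Illustrative sketch of the Bayesian learning setup described above, on a
# one-parameter toy model (predicting binary labels with probability theta).
# Everything here (grid, prior, data) is made up for illustration.
import numpy as np

rng = np.random.default_rng(0)

# "Consider all possible settings of the parameters": a grid over theta.
thetas = np.linspace(0.01, 0.99, 99)

# "Assign some prior to them": here, uniform.
prior = np.ones_like(thetas) / len(thetas)

# Training data: binary labels drawn with a true theta of 0.7.
labels = rng.random(50) < 0.7

# "Check how well they predict the correct labels": likelihood of the data.
k = labels.sum()
n = len(labels)
likelihood = thetas**k * (1 - thetas) ** (n - k)

# "Perform a Bayesian update on the prior."
posterior = prior * likelihood
posterior /= posterior.sum()

# "Then sample a model from the posterior."
sampled_theta = rng.choice(thetas, p=posterior)
print(f"posterior mean={np.average(thetas, weights=posterior):.3f}, "
      f"sampled model theta={sampled_theta:.3f}")
```

With one parameter the grid is tiny; Bob's point is that the same update over the parameter space of a 7-billion-parameter network is hopelessly expensive, which is why the Bayesian setting is used here only as a cleaner theoretical lens.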
Jul 8, 2024 • 10min

LW - Pantheon Interface by NicholasKees

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Pantheon Interface, published by NicholasKees on July 8, 2024 on LessWrong. Pantheon is an experimental LLM interface exploring a different type of human-AI interaction. We created this as a part of the cyborgism project, with the abstract motivation of augmenting the human ability to think by integrating human and AI generated thoughts.

How it works:
1. A human user "thinks out loud" by typing out their thoughts one at a time. This leaves a text trace of their stream of thought.
2. AI characters (called daemons) read this trace, and interact with the user by responding asynchronously with comments and questions.

The core distinguishing philosophy is that, while most apps are about a human prompting an AI to do useful mental work, Pantheon is the opposite. Here, AI does the prompting, and the goal is for the AI generated questions or comments to cause the human user to think in ways they would not have on their own. At worst, the app is a rubber duck. At best, the app is a court of advisors, each using their own unique skills to push you to think your best thoughts. Pantheon can be found at pantheon.chat, and we would really appreciate any and all feedback you have. The app is set up for you to customize your own daemons. We have set up some default daemons to provide inspiration, but we expect the tool to be a lot more useful when they are customized to specific users. If the default daemons don't feel useful, we highly encourage you to try to make your own.

How do I use Pantheon?

First, go to settings and provide an OpenAI API key. Next, begin typing out your thoughts on some topic. It helps to keep each thought relatively short, sending them to the stream of thought as often as you can. This gives the daemons lots of opportunities to interject and offer their comments. Furthermore, it's usually best to treat this more like a diary or personal notes, rather than as a conversation. In this spirit, it's better not to wait for them to respond, but to instead continue your train of thought, keeping your focus on your own writing.

What do the daemons see?

Your stream of thought appears in the interface as a chain of individual thoughts. Daemons are called to respond to specific thoughts. When they do, they are given access to all preceding thoughts in the chain, up to and including the thought they were called to. Daemons can only see text the user has written, and they can't see any of the comments made by themselves or other daemons. We are looking into ways to give the daemons access to their own comment history, but we have not yet made this possible. After a daemon generates a comment, you can inspect the full chain of thought by clicking on that comment. This will open up a window which will show you everything the LLM saw in the process of generating that response. You can also edit the daemons in settings, as well as toggle them on or off.

Trees, branching, and sections

The text in the interface appears to you as a chain of thoughts, but it is actually a tree. If you hover over a thought, a plus icon will appear. If you click this icon, you can branch the chain. This is often useful if you feel that you have gone down a dead end, or would like to explore a tangent. When there are multiple branches, arrows will appear next to their parent thought, and you can use those arrows to navigate the tree. If you would like a fresh context, you can make an entirely new tree by opening the "Collection view" in the top left. Furthermore, you can also create a new "section" by clicking the "New Section" button below the input box. This will create a hard section break such that daemons can no longer see any context which came before the break.

How do I save my progress?

Everything you do is automatically saved in local storage. You can also import/export the full app state i...
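To make the context rules above concrete, here is a rough sketch of the kind of thought-tree data model being described, where a daemon called on a thought sees only that thought and its ancestors (user text only). The class and function names are hypothetical; this is a guess at a minimal structure, not Pantheon's actual implementation.

```python
# Rough sketch of the thought-tree context rule described above: a daemon
# called on a thought sees only the user-written thoughts on the path from
# the root to that thought. This is not Pantheon's actual code.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Thought:
    text: str
    parent: Optional["Thought"] = None
    children: list["Thought"] = field(default_factory=list)

    def add_child(self, text: str) -> "Thought":
        child = Thought(text, parent=self)
        self.children.append(child)
        return child

def daemon_context(thought: Thought) -> list[str]:
    """All preceding thoughts in the chain, up to and including this one."""
    chain = []
    node: Optional[Thought] = thought
    while node is not None:
        chain.append(node.text)
        node = node.parent
    return list(reversed(chain))

# Example: a root thought with two branches; a daemon called on the second
# branch never sees the first branch, only its own ancestor chain.
root = Thought("I want to plan my research week.")
branch_a = root.add_child("Maybe start with the literature review?")
branch_b = root.add_child("Actually, the experiment setup is more urgent.")
print(daemon_context(branch_b))
```

A "section break" in this picture would simply truncate the ancestor chain at the break point before handing it to the daemon.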
Jul 8, 2024 • 59min

LW - Towards shutdownable agents via stochastic choice by EJT

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Towards shutdownable agents via stochastic choice, published by EJT on July 8, 2024 on LessWrong. We[1] have a new paper testing the Incomplete Preferences Proposal (IPP). The abstract and main text are below. Appendices are in the linked PDF.

Abstract

Some worry that advanced artificial agents may resist being shut down. The Incomplete Preferences Proposal (IPP) is an idea for ensuring that doesn't happen. A key part of the IPP is using a novel 'Discounted REward for Same-Length Trajectories (DREST)' reward function to train agents to:
1. pursue goals effectively conditional on each trajectory-length (be 'USEFUL')
2. choose stochastically between different trajectory-lengths (be 'NEUTRAL' about trajectory-lengths).
In this paper, we propose evaluation metrics for USEFULNESS and NEUTRALITY. We use a DREST reward function to train simple agents to navigate gridworlds, and we find that these agents learn to be USEFUL and NEUTRAL. Our results thus suggest that DREST reward functions could also train advanced agents to be USEFUL and NEUTRAL, and thereby make these advanced agents useful and shutdownable.

1. Introduction

1.1. The shutdown problem

Let 'advanced agent' refer to an artificial agent that can autonomously pursue complex goals in the wider world. We might see the arrival of advanced agents within the next few decades. There are strong economic incentives to create such agents, and creating systems like them is the stated goal of companies like OpenAI and Google DeepMind. The rise of advanced agents would bring with it both benefits and risks. One risk is that these agents learn misaligned goals: goals that we don't want them to have [Leike et al., 2017, Hubinger et al., 2019, Russell, 2019, Carlsmith, 2021, Bengio et al., 2023, Ngo et al., 2023]. Advanced agents with misaligned goals might try to prevent us shutting them down [Omohundro, 2008, Bostrom, 2012, Soares et al., 2015, Russell, 2019, Thornley, 2024a]. After all, most goals can't be achieved after shutdown. As Stuart Russell puts it, 'you can't fetch the coffee if you're dead' [Russell, 2019, p.141]. Advanced agents with misaligned goals might resist shutdown by (for example) pretending to have aligned goals while covertly seeking to escape human control [Hubinger et al., 2019, Ngo et al., 2023]. Agents that succeed in resisting shutdown could go on to frustrate human interests in various ways. 'The shutdown problem' is the problem of training advanced agents that won't resist shutdown [Soares et al., 2015, Thornley, 2024a].

1.2. A proposed solution

The Incomplete Preferences Proposal (IPP) is a proposed solution to the shutdown problem [Thornley, 2024b]. Simplifying slightly, the idea is that we train agents to be neutral about when they get shut down. More precisely, the idea is that we train agents to satisfy:

Preferences Only Between Same-Length Trajectories (POST)
1. The agent has a preference between many pairs of same-length trajectories (i.e. many pairs of trajectories in which the agent is shut down after the same length of time).
2. The agent lacks a preference between every pair of different-length trajectories (i.e. every pair of trajectories in which the agent is shut down after different lengths of time).

By 'preference,' we mean a behavioral notion [Savage, 1954, p.17, Dreier, 1996, p.28, Hausman, 2011, §1.1].
On this notion, an agent prefers X to Y if and only if the agent would deterministically choose X over Y in choices between the two. An agent lacks a preference between X and Y if and only if the agent would stochastically choose between X and Y in choices between the two. So in writing of 'preferences,' we're only making claims about the agent's behavior. We're not claiming that the agent is conscious or anything of that sort. Figure 1a presents a simple example of POST-satisfying ...
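As a rough intuition pump for the behavioral notion of "lacking a preference between trajectory-lengths", here is a minimal sketch of one simple way to measure how stochastically an agent chooses among trajectory-lengths, using normalized entropy of its observed choices. This is an invented stand-in for intuition, not the paper's actual NEUTRALITY metric or DREST reward function.

```python
# Illustrative sketch only: one simple way to quantify how "stochastically"
# an agent chooses between trajectory-lengths, via the entropy of its
# observed choices relative to a uniform distribution. This is a stand-in
# for intuition, not the paper's actual NEUTRALITY metric.
import math
from collections import Counter

def evenness(trajectory_length_choices: list[int]) -> float:
    """Normalized entropy in [0, 1]: 1.0 means the agent splits its choices
    evenly across the trajectory-lengths it picked (maximally stochastic),
    0.0 means it deterministically picks a single length."""
    counts = Counter(trajectory_length_choices)
    if len(counts) == 1:
        return 0.0
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(counts))

# An agent that always ends episodes at length 3 vs. one that mixes lengths.
print(evenness([3] * 100))    # 0.0: behaviorally reveals a preference
print(evenness([3, 5] * 50))  # 1.0: behaviorally lacks a preference
```

The point of the behavioral framing is exactly this: "lacking a preference" cashes out as the agent's observed choices being spread across options rather than collapsing onto one of them.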
Jul 8, 2024 • 3min

EA - Join the next round of AIM's Grantmaking Program! by Andrea (Danny) Folds

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Join the next round of AIM's Grantmaking Program!, published by Andrea (Danny) Folds on July 8, 2024 on The Effective Altruism Forum.

TL;DR

The next round of AIM's Grantmaking Program kicks off in September. We'll be taking a small cohort of funders and grantmakers through a free 9-week course designed to strengthen your grantmaking skills and network. If you are a funder giving over ~$1M annually and want to increase your impact, get in touch by August 1st to join us.

About the program

Boiled down to 3 things, we focus on:
Strategic thinking: identifying your values and goals; developing an evidence-based approach; and mitigating common biases and logical errors in grantmaking
Methodical impact: using tools like cost-effectiveness analyses and weighted factor models to define impact and pursue positive outcomes, irrespective of who benefits from them
Engaged community: connecting committed funders so they can learn from each other, better coordinate funding, and network with subject matter experts

Program structure

Duration: 9 weeks (September 23rd to November 22nd)
Format: 8 weeks of interactive online sessions + 1 week in-person in London (October 21-26)
Time commitment: 4-6 hours per week, with 1 full-time week in person
Cost: free

What you will gain

Effective strategies: a wide range of grantmaking practices to maximize the effectiveness of your funding
Impact assessment skills: tools for tracking and evaluating your impact
Sector development insights: ideas for improving philanthropic norms by sharing your learning with others and leveraging your impact by working with others
Practical application: an individual final project that applies your new skills to a real-world grantmaking challenge of your choice (e.g., developing a grantmaking strategy for a specific cause area, conducting in-depth research on a promising charity, or collaborating on a joint funding initiative)

Why join?

Grantmaking is hard, as Scott Alexander summed up in this pretty astute post. To be an effective grantmaker long-term, and not just make a few good grants when the stars align, you need a solid skillset and an engaged community. Whether you're brand new to grantmaking or a seasoned philanthropist looking to hone your skills, our program can offer valuable resources and a supportive network to help you advance your philanthropic goals. It's a flexible course in terms of time demands and workload, and the content can be easily personalized to your individual interests and needs.

Schedule a chat to learn more

For more information or to explore joining the program, please visit our website or shoot us an email. We're always happy to chat. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
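For readers unfamiliar with the weighted factor models mentioned under "Methodical impact" above, here is a minimal sketch of the general idea. The factors, weights, and grant options are entirely made up for illustration; AIM's actual course materials and templates may look quite different.

```python
# Minimal sketch of a weighted factor model for comparing grant options.
# The factors, weights, and scores below are invented; they only illustrate
# the general mechanic of weighting and summing factor scores.
weights = {"scale": 0.4, "neglectedness": 0.3, "tractability": 0.2, "team": 0.1}

options = {
    "Charity A": {"scale": 8, "neglectedness": 6, "tractability": 5, "team": 9},
    "Charity B": {"scale": 5, "neglectedness": 9, "tractability": 7, "team": 6},
    "Charity C": {"scale": 7, "neglectedness": 4, "tractability": 9, "team": 7},
}

def weighted_score(scores: dict[str, float]) -> float:
    return sum(weights[factor] * scores[factor] for factor in weights)

ranked = sorted(options.items(), key=lambda kv: weighted_score(kv[1]), reverse=True)
for name, scores in ranked:
    print(f"{name}: {weighted_score(scores):.2f}")
```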
Jul 8, 2024 • 48min

AF - Response to Dileep George: AGI safety warrants planning ahead by Steve Byrnes

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Response to Dileep George: AGI safety warrants planning ahead, published by Steve Byrnes on July 8, 2024 on The AI Alignment Forum. (Target audience: Dileep George himself, and anyone coming from a similar place.) Dileep George is a researcher working at the intersection of AI and neuroscience. He started his career by co-founding Numenta in 2005 with Jeff Hawkins (while a Stanford PhD student), then he left to co-found Vicarious in 2010 with D. Scott Phoenix, and moved to DeepMind in 2022 when DeepMind acquired Vicarious. Dileep was recently interviewed by Daniel Faggella on his "The Trajectory" podcast: YouTube, Apple podcasts, X/Twitter. It's a fun interview that touched on many topics, most of which I'll ignore, in favor of one very important action-relevant disagreement between Dileep and myself. …And this is the point where everyone these days seems to assume that there are only two possible reasons that anyone would ever bring up the topic of Artificial General Intelligence (AGI) safety in conversation: The person is advocating for government regulation of large ML training runs …or the person is advocating against government regulation of large ML training runs. But, no! That's not my disagreement! That's not why I'm writing this post!! Quite the contrary, I join Dileep in being basically unenthusiastic about governmental regulation of large ML training runs right now. Instead, this post is advocating for Differential Intellectual Progress within technical AI research of the type that Dileep is doing - and more specifically, I'm advocating in favor of figuring out a technical approach to sculpting AGI motivations in docile and/or prosocial directions (a.k.a. "solving the technical alignment problem") before figuring out the exact data structures and parameter-updating rules that would constitute an AGI's ability to build and query a powerful world-model. The first half of this post (§1-2) will try to explain what I'm talking about, what it would entail, and why I think it's critically important. The second half of this post (§3) is more specifically my pessimistic response to Dileep's suggestion that, as AGI is gradually developed in the future, people will be able to react and adapt to problems as they arise. I really think Dileep is a brilliant guy with the best of intentions (e.g. he's a signatory on the Asilomar AI Principles). I just think there are some issues that he hasn't spent much time thinking through. I hope that this post will help. Post outline: Section 1 lists some areas of agreement and disagreement between Dileep and me. In particular, we have a giant area of agreement in terms of how we expect future AGI algorithms to work. Our massive common ground here is really why I'm bothering to write this post at all - it makes me hopeful that Dileep & I can have a productive exchange, and not just talk past each other. Section 2 argues that, for the kind of AGI that Dileep is trying to build, there's an unsolved technical alignment problem: How do we set up this kind of AGI with the motivation to behave in a docile and/or prosocial way? 
Section 3 is my pessimistic push-back on Dileep's optimistic hope that, if AGI is developed gradually, then we can regulate or adapt to problems as they arise: Section 3.1 lists some big obvious societal problems that have been around for a long time, but nevertheless remain unsolved, along with generic discussions of some underlying challenges that have prevented them from being solved, and why those challenges may apply to AGI too. Section 3.2 dives more specifically into the question of whether we can "keep strong AI as a tool, not a successor", as Dileep hopes. I think it sounds nice but will be impossible to pull off. Section 3.3 comments that, even if we could react and adapt to AGI given enough time - an assum...
Jul 8, 2024 • 1min

EA - Job opportunities with new Labour MPs by JamesÖz

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Job opportunities with new Labour MPs, published by JamesÖz on July 8, 2024 on The Effective Altruism Forum. (I don't work for the Labour Party but I thought this was interesting) Labour will be hiring hundreds of people to work for new MPs over the next few weeks/months. They created a centralised website where you can submit your CV and register your interest in a job (e.g. parliamentary researcher, office manager, communications manager, etc). This seems like a pretty good opportunity for UK-based people who want to get some experience working in politics or for an MP. Given that Labour is now in power, this could be a potentially quite high-impact role. Would love to hear if people apply and how it goes. If there are any animal welfare-motivated people interested in this opportunity, feel free to reach out (via DM or email) and happy to offer some ad-hoc support e.g. review your CV and application or connect you with people who have relevant experience. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Jul 8, 2024 • 7min

EA - An Introduction to the CRAFT Sequence by Bob Fischer

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: An Introduction to the CRAFT Sequence, published by Bob Fischer on July 8, 2024 on The Effective Altruism Forum. This post introduces Rethink Priorities' Charitable Resource Allocation Frameworks and Tools Sequence (the CRAFT Sequence). After a brief statement of the problems that CRAFT aims to address, we provide an overview of what it includes.

Building Giving Portfolios

Some people think that you should go all-in on particular giving opportunities. Some people think that you should diversify your giving portfolio. What assumptions and circumstances favor going all-in? What assumptions and circumstances favor diversification? And either way, what should your resources support? Rethink Priorities' recent Cross-Cause Cost-Effectiveness Model (CCM) can help us rank interventions within certain cause areas. It can also help us rank options based on a handful of key decision theories. However, the CCM isn't designed to produce giving portfolios per se. The CCM can help us compare interventions with respect to their expected value or risk-adjusted value. But it was never intended to answer the question: "How should I split a certain amount of money given what matters to me?" We need other tools for that purpose. The CRAFT Sequence introduces beta versions of two such tools: a risk-based portfolio builder, where the key uncertainties concern cost curves and decision theories, and a moral-parliament-based portfolio builder, which allows for the modeling of both normative and metanormative uncertainty. The Sequence's primary goal is to take some first steps toward more principled and transparent ways of constructing giving portfolios. Our tools make debates about worldviews more tractable by illustrating how assumptions about cost curves, attitudes toward risk, and credences in moral theories can influence allocation decisions. These tools are limited in ways you would expect. Their specific recommendations are only as good as their highly uncertain inputs; they assume that you're acting in isolation even though others' allocations can be relevant to what's optimal for you; they sometimes sacrifice granularity for computational efficiency; and so on. Still, the process of operationalizing and implementing proposals is instructive: it makes the choice points clear, it automates relevant calculations, it makes optimization possible, and it paves the way for future research. These tools therefore offer significant improvements over commonly used BOTECs.

What's to Come

In the coming sequence, we will introduce and comment on two tools for constructing portfolios: one focused on cost-effectiveness under various attitudes toward risk and a second that uses a moral parliament to allocate resources under metanormative uncertainty. The second post introduces the Portfolio Builder Tool that allows you to build a giving portfolio based on (a) the amount of money you want to give, (b) your attitudes toward risk, and (c) some assumptions about the features of the interventions you're considering. The third and fourth posts explore two risk attitudes that this tool incorporates. The third considers challenges to caring about making a difference; the fourth considers the common practice of "rounding down" low probabilities, which is one way of implementing an aversion to poorly justified probabilities.
Of course, people don't simply have different attitudes toward risk; they also give some credence to a range of different moral views. So, the fifth post introduces our Moral Parliament Tool, which allows users to consider the impact of moral uncertainty in addition to various risk attitudes. This tool implements a moral parliament and several voting procedures for adjudicating disagreements among the delegates. And, like the first tool, the associated documentation explores the philosophic...
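To give a feel for how a risk attitude can change an allocation (the kind of question the portfolio builders are meant to make explicit), here is a toy sketch with invented interventions and an invented allocation rule; it is not Rethink Priorities' Portfolio Builder or Moral Parliament Tool.

```python
# Toy sketch of how a risk attitude can shift a giving allocation between a
# "safe" and a "long-shot" intervention. The numbers and the proportional
# allocation rule are invented; this is not RP's Portfolio Builder.
interventions = {
    # name: (probability the intervention works, value per dollar if it works)
    "safe bet":  (0.95, 10.0),
    "long shot": (0.01, 2000.0),
}

def risk_weighted_value(p: float, value: float, rounding_floor: float) -> float:
    """Expected value, except probabilities below the floor are 'rounded
    down' to zero - one crude way to model aversion to poorly justified
    low probabilities, in the spirit of the sequence's fourth post."""
    if p < rounding_floor:
        return 0.0
    return p * value

for floor in (0.0, 0.05):
    scores = {name: risk_weighted_value(p, v, floor)
              for name, (p, v) in interventions.items()}
    total = sum(scores.values())
    allocation = {name: (s / total if total else 0.0) for name, s in scores.items()}
    print(f"rounding floor={floor}: allocation={allocation}")
```

With no rounding floor, the long shot dominates on expected value; with even a modest floor, the whole budget flips to the safe bet, which is the sort of sensitivity the tools are designed to surface.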
