The Nonlinear Library

The Nonlinear Fund
May 23, 2024 • 2min

AF - Paper in Science: Managing extreme AI risks amid rapid progress by JanB

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Paper in Science: Managing extreme AI risks amid rapid progress, published by JanB on May 23, 2024 on The AI Alignment Forum. https://www.science.org/doi/10.1126/science.adn0117 Authors: Yoshua Bengio, Geoffrey Hinton, Andrew Yao, Dawn Song, Pieter Abbeel, Yuval Noah Harari, Ya-Qin Zhang, Lan Xue, Shai Shalev-Shwartz, Gillian Hadfield, Jeff Clune, Tegan Maharaj, Frank Hutter, Atılım Güneş Baydin, Sheila McIlraith, Qiqi Gao, Ashwin Acharya, David Krueger, Anca Dragan, Philip Torr, Stuart Russell, Daniel Kahneman, Jan Brauner*, Sören Mindermann* Abstract: Artificial intelligence (AI) is progressing rapidly, and companies are shifting their focus to developing generalist AI systems that can autonomously act and pursue goals. Increases in capabilities and autonomy may soon massively amplify AI's impact, with risks that include large-scale social harms, malicious uses, and an irreversible loss of human control over autonomous AI systems. Although researchers have warned of extreme risks from AI, there is a lack of consensus about how to manage them. Society's response, despite promising first steps, is incommensurate with the possibility of rapid, transformative progress that is expected by many experts. AI safety research is lagging. Present governance initiatives lack the mechanisms and institutions to prevent misuse and recklessness and barely address autonomous systems. Drawing on lessons learned from other safety-critical technologies, we outline a comprehensive plan that combines technical research and development with proactive, adaptive governance mechanisms for a more commensurate preparation. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
May 23, 2024 • 8min

LW - "Which chains-of-thought was that faster than?" by Emrik

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: "Which chains-of-thought was that faster than?", published by Emrik on May 23, 2024 on LessWrong. Here's some good advice from Eliezer: TAP: "How could I have thought that faster?" WHEN[1] you complete a chain-of-thought THEN ask yourself, "how could I have thought that faster?" I really like this heuristic, and it's already paid its rent several times over for me. Most recently today, so I'll share the (slightly edited) cognitive trace of it as an example: Example: To find the inverse of something, trace the chain forward a few times first 1. I was in the context of having just asked myself "what's the set of functions which have this function as its derivative?" 2. This is of course its integral, but I didn't want to use cached abstractions, and instead sought to get a generalized view of the landscape from first-principles. 3. For about ~10 seconds, I tried to hold the function f in my mind while trying to directly generate the integral landscape from it. 4. This seemed awfwly inefficient, so I changed tack: I already know some specific functions whose derivatives equal f, so I held those as the proximal thing in my mind while retracing the cognitive steps involved in their derivation. 5. After making those steps more salient in the forward direction (integral → derivative), it was easier to retrace the path in the opposite direction. 6. And once the derivative → integral trace was salient for a few examples, it was easier to generalize from the examples to produce the landscape of all the integrals. 7. There are multiple takeaways here, but one is: 1. "If you struggle to generalize something, find a way to generate specific examples first, then generalize from the examples." TAP: "Which chains-of-thought was that faster than?" Imo, more important than asking "how could I have thought that faster?" is the inverse heuristic: WHEN you complete a good chain-of-thought THEN ask yourself, "which chains-of-thought was that faster than?" Although, ideally, I wouldn't scope the trigger to every time you complete a thought, since that overburdens the general cue. Instead, maybe limit it to those times when you have an especially clear trace of it AND you have a hunch that something about it was unusually good. WHEN you complete a good chain of thought AND you have its trace in short-term memory AND you hunch that something about it was unusually effective THEN ask yourself, "which chains-of-thought was that faster than?" Example: Sketching out my thoughts with pen-and-paper 1. Yesterday I was writing out some plans explicitly with pen and paper - enumerating my variables and drawing arrows between them. 2. I noticed - for the umpteenth time - that forcing myself to explicitly sketch out the problem (even with improvised visualizations) is far more cognitively ergonomic than keeping it in my head (see eg why you should write pseudocode). 3. But instead of just noting "yup, I should force myself to do more pen-and-paper", I asked myself two questions: 1. "When does it help me think, and when does it just slow me down?" 1. This part is important: scope your insight sharply to contexts where it's usefwl - hook your idea into the contexts where you want it triggered - so you avoid wasting memory-capacity on linking it up to useless stuff. 2. 
In other words, you want to minimize (unwanted) associative interference so you can remember stuff at lower cost. 3. My conclusion was that pen-and-paper is good when I'm trying to map complex relations between a handfwl of variables. 4. And it is NOT good when I have just a single proximal idea that I want to compare against a myriad of samples with high false-positive rate - that's instead where I should be doing inside-head thinking to exploit the brain's massively parallel distributed processor. 2. "Why am I so reluctant to do it?" 1. This se...
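To make steps 4-6 of the first example concrete: the post leaves the particular function unspecified, so purely as an illustration suppose f(x) = x². The forward-then-reverse move then looks roughly like this (my own worked sketch, not from the post):

```latex
% Forward direction (known function -> its derivative), rehearsed a few times:
\frac{d}{dx}\left(\tfrac{x^3}{3}\right) = x^2, \qquad
\frac{d}{dx}\left(\tfrac{x^3}{3} + 7\right) = x^2
% Reverse direction (derivative -> integral), now easier to trace:
x^2 \;\longmapsto\; \tfrac{x^3}{3} + 7,\ \tfrac{x^3}{3} - 2,\ \ldots
% Generalizing over the specific examples gives the whole landscape:
\int x^2\, dx = \tfrac{x^3}{3} + C
```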
May 22, 2024 • 6min

EA - Survey: bioethicists' views on bioethical issues by Leah Pierson

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Survey: bioethicists' views on bioethical issues, published by Leah Pierson on May 22, 2024 on The Effective Altruism Forum. Summary Bioethicists influence practices and policies in medicine, science, and public health. However, little is known about bioethicists' views in aggregate. We recently surveyed 824 U.S. bioethicists on a wide range of ethical issues, including several issues of interest to the EA community (e.g., compensating organ donors, priority setting, paternalistic regulations, and trade-offs between human and animal welfare, among others). We aimed to contact everyone who presented at the American Society for Bioethics and Humanities Annual Conference in 2021 or 2022 and/or is affiliated with a US bioethics training program. Of the 1,713 people contacted, 824 (48%) completed the survey. Why should EAs care? 1. As Devin Kalish puts it in this nice post: "Bioethics is the field of ethics that focuses on issues like pandemics, human enhancement, AI, global health, animal rights, and environmental ethics. Bioethicists, in short, have basically the same exact interests as us." 2. Many EAs don't hold the bioethics community in high regard. Much of this animus seems to stem from EAs' perception that bioethicists have bad takes. (See Devin's post for more on this.) Our survey casts light on bioethicists' views; people can update their opinions accordingly. What did we find? Chris Said of Apollo Surveys[1] separately analyzed our data and wrote a blog post summarizing our results: Primary results A large majority (87%) of bioethicists believed that abortion was ethically permissible. 82% thought it was permissible to select embryos based on somewhat painful medical conditions, whereas only 22% thought it was permissible to select on non-medical traits like eye color or height. 59% thought it was ethically permissible for clinicians to assist patients in ending their own lives. 15% of bioethicists thought it was ethically permissible to offer payment in exchange for organs (e.g. kidneys). Question 1 Please provide your opinion on whether the following actions are ethically permissible. Is abortion ethically permissible? Is it ethically permissible to select some embryos over others for gestation on the basis of somewhat painful medical conditions? Is it ethically permissible to make trade-offs between human welfare and non-human animal welfare? Is it ethically permissible for a clinician to treat a 14-year-old for opioid use disorder without their parents' knowledge or consent? Is it ethically permissible to offer payment in exchange for blood products? Is it ethically permissible to subject people to regulation they disagree with, solely for the sake of their own good? Is it ethically permissible for clinicians to assist patients in ending their own lives if they request this? Is it ethically permissible for a government to allow an individual to access treatments that have not been approved by regulatory agencies, but only risk harming that individual and not others? Is it ethically permissible to consider an individual's past decisions when determining their access to medical resources? Is it ethically permissible to select some embryos over others for gestation on the basis of non-medical traits (e.g., eye color, height)? Is it ethically permissible to offer payment in exchange for organs (e.g., kidneys)? 
Is it ethically permissible for decisional surrogates to make a medical decision that they believe is in a patient's best interest, even when that decision goes against the patient's previously stated preferences? Is it ethically permissible for a clinician to provide life-saving care to an adult patient who has refused that care and has decision-making capacity? Results Question 2 In general, should policymakers consider non-health benefits and harms (lik...
May 22, 2024 • 25min

LW - Do Not Mess With Scarlett Johansson by Zvi

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Do Not Mess With Scarlett Johansson, published by Zvi on May 22, 2024 on LessWrong. I repeat. Do not mess with Scarlett Johansson. You would think her movies, and her suit against Disney, would make this obvious. Apparently not so. Andrej Karpathy (co-founder OpenAI, departed earlier), May 14: The killer app of LLMs is Scarlett Johansson. You all thought it was math or something. You see, there was this voice they created for GPT-4o, called 'Sky.' People noticed it sounded suspiciously like Scarlett Johansson, who voiced the AI in the movie Her, which Sam Altman says is his favorite movie of all time, which he says inspired OpenAI 'more than a little bit,' and then he tweeted "Her" on its own right before the GPT-4o presentation, and which was the comparison point for many people reviewing the GPT-4o debut? Quite the Coincidence I mean, surely that couldn't have been intentional. Oh, no. Kylie Robison: I asked Mira Murati about Scarlett Johansson-type voice in today's demo of GPT-4o. She clarified it's not designed to mimic her, and said someone in the audience asked this exact same question! Kylie Robison in Verge (May 13): Title: ChatGPT will be able to talk to you like Scarlett Johansson in Her. OpenAI reports on how it created and selected its five GPT-4o voices. OpenAI: We support the creative community and worked closely with the voice acting industry to ensure we took the right steps to cast ChatGPT's voices. Each actor receives compensation above top-of-market rates, and this will continue for as long as their voices are used in our products. We believe that AI voices should not deliberately mimic a celebrity's distinctive voice - Sky's voice is not an imitation of Scarlett Johansson but belongs to a different professional actress using her own natural speaking voice. To protect their privacy, we cannot share the names of our voice talents. … Looking ahead, you can expect even more options as we plan to introduce additional voices in ChatGPT to better match the diverse interests and preferences of users. Jessica Taylor: My "Sky's voice is not an imitation of Scarlett Johansson" T-shirt has people asking a lot of questions already answered by my shirt. OpenAI: We've heard questions about how we chose the voices in ChatGPT, especially Sky. We are working to pause the use of Sky while we address them. Variety: Altman said in an interview last year that "Her" is his favorite movie. Variety: OpenAI Suspends ChatGPT Voice That Sounds Like Scarlett Johansson in 'Her': AI 'Should Not Deliberately Mimic a Celebrity's Distinctive Voice.' [WSJ had similar duplicative coverage.] Flowers from the Future: That's why we can't have nice things. People bore me. Again: Do not mess with Scarlett Johansson. She is Black Widow. She sued Disney. Several hours after compiling the above, I was happy to report that they did indeed mess with Scarlett Johansson. She is pissed. Bobby Allyn (NPR): Scarlett Johansson says she is 'shocked, angered' over new ChatGPT voice. … Johansson's legal team has sent OpenAI two letters asking the company to detail the process by which it developed a voice the tech company dubbed "Sky," Johansson's publicist told NPR in a revelation that has not been previously reported. NPR then published her statement, which follows. 
Scarlett Johansson's Statement Scarlett Johansson: Last September, I received an offer from Sam Altman, who wanted to hire me to voice the current ChatGPT 4.0 system. He told me that he felt that by my voicing the system, I could bridge the gap between tech companies and creatives and help consumers to feel comfortable with the seismic shift concerning humans and AI. He said he felt that my voice would be comforting to people. After much consideration and for personal reasons, I declined the offer. Nine months later, my friends,...
May 22, 2024 • 9min

EA - Summary: Against the singularity hypothesis by Global Priorities Institute

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Summary: Against the singularity hypothesis, published by Global Priorities Institute on May 22, 2024 on The Effective Altruism Forum. This is a summary of the GPI Working Paper "Against the singularity hypothesis" by David Thorstad (published in Philosophical Studies). The summary was written by Riley Harris. The singularity is a hypothetical future event in which machines rapidly become significantly smarter than humans. The idea is that we might invent an artificial intelligence (AI) system that can improve itself. After a single round of self-improvement, that system would be better equipped to improve itself than before. This process might repeat many times, and each time the AI system would become more capable and better equipped to improve itself even further. At the end of this (perhaps very rapid) process, the AI system could be much smarter than the average human. Philosophers and computer scientists have thought we should take the possibility of a singularity seriously (Solomonoff 1985, Good 1996, Chalmers 2010, Bostrom 2014, Russell 2019). It is characteristic of the singularity hypothesis that AI will take years or months at the most to become many times more intelligent than even the most intelligent human.[1] Such extraordinary claims require extraordinary evidence. In the paper "Against the singularity hypothesis", David Thorstad claims that we do not have enough evidence to justify the belief in the singularity hypothesis, and we should consider it unlikely unless stronger evidence emerges. Reasons to think the singularity is unlikely Thorstad is sceptical that machine intelligence can grow quickly enough to justify the singularity hypothesis. He gives several reasons for this. Low-hanging fruit. Innovative ideas and technological improvements tend to become more difficult over time. For example, consider "Moore's law", which is (roughly) the observation that hardware capacities double every two years. Between 1971 and 2014 Moore's law was maintained only with an astronomical increase in the amount of capital and labour invested into semiconductor research (Bloom et al. 2020). In fact, according to one leading estimate, there was an eighteen-fold drop in productivity over this period. While some features of future AI systems will allow them to increase the rate of progress compared to human scientists and engineers, they are still likely to experience diminishing returns as the easiest discoveries have already been made and only more difficult ideas are left. Bottlenecks. AI progress relies on improvements in search, computation, storage and so on (each of these areas breaks down into many subcomponents). Progress could be slowed down by any of these subcomponents: if any of these are difficult to speed up, then AI progress will be much slower than we would naively expect. The classic metaphor here concerns the speed a liquid can exit a bottle, which is rate-limited by the narrow space near the opening. AI systems may run into bottlenecks if any essential components cannot be improved quickly (see Aghion et al., 2019). Constraints. Resource and physical constraints may also limit the rate of progress. To take an analogy, Moore's law gets more difficult to maintain because it is expensive, physically difficult and energy-intensive to cram ever more transistors in the same space. 
Here we might expect progress to eventually slow as physical and financial constraints provide ever greater barriers to maintaining progress. Sublinear growth. How do improvements in hardware translate to intelligence growth? Thompson and colleagues (2022) find that exponential hardware improvements translate to linear gains in performance on problems such as Chess, Go, protein folding, weather prediction and the modelling of underground oil reservoirs. Over the past 50 years,...
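As a toy formalization of the sublinear-growth point (my own illustration, not from Thorstad's paper or the Thompson et al. results): if performance grows roughly logarithmically in compute, then exponential hardware growth buys only linear performance gains.

```latex
% Illustrative assumption: performance P as a function of compute C, constant a > 0
P(C) = a \log_2 C
% If compute doubles every two years, C(t) = C_0 \cdot 2^{t/2}, then
P(t) = a \log_2 C_0 + a\,\frac{t}{2}
% i.e., exponential growth in hardware translates into merely linear growth in P,
% which is the kind of scaling that cuts against a rapid intelligence explosion.
```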
May 22, 2024 • 17min

EA - Scorable Functions: A Format for Algorithmic Forecasting by Ozzie Gooen

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Scorable Functions: A Format for Algorithmic Forecasting, published by Ozzie Gooen on May 22, 2024 on The Effective Altruism Forum. Introduction Imagine if a forecasting platform had estimates for things like: 1. "For every year until 2100, what will be the probability of a global catastrophic biological event, given different levels of biosecurity investment and technological advancement?" 2. "What will be the impact of various AI governance policies on the likelihood of developing safe and beneficial artificial general intelligence, and how will this affect key indicators of global well-being over the next century?" 3. "How valuable is every single project funded by Open Philanthropy, according to a person with any set of demographic information, if they would spend 1000 hours reflecting on it?" These complex, multidimensional questions are useful for informing decision-making and resource allocation around effective altruism and existential risk mitigation. However, traditional judgemental forecasting methods often struggle to capture the nuance and conditionality required to address such questions effectively. This is where "scorable functions" come in - a forecasting format that allows forecasters to directly submit entire predictive models rather than just point estimates or simple probability distributions. Scorable functions allow encoding a vast range of relationships and dependencies, from basic linear trends to intricate nonlinear dynamics. Forecasters can precisely specify interactions between variables, the evolution of probabilities over time, and how different scenarios could unfold. At their core, scorable functions are executable models that output probabilistic predictions and can be directly scored via function calls. They encapsulate the forecasting logic, whether it stems from human judgment, data-driven insights, or a hybrid of the two. Scorable functions can span from concise one-liners to elaborate constructs like neural networks. Over the past few years, we at QURI have been investigating how to effectively harness these methods. We believe scorable functions could be a key piece of the forecasting puzzle going forward. From Forecast Bots to Scorable Functions Many people are familiar with the idea of using "bots" to automate forecasts on platforms like Metaculus. Let's consider a simple example to see how scorable functions can extend this concept. Suppose there's a binary question on Metaculus: "Will event X happen in 2024?" Intuitively, the probability should decrease as 2024 progresses, assuming no resolution. A forecaster might start at 90% in January, but want to gradually decrease to 10% by December. One approach is to manually update the forecast each week - a tedious process. A more efficient solution is to write a bot that submits forecasts based on a simple function: (Example using Squiggle, but hopefully it's straightforward enough) This bot can automatically submit daily forecasts via the Metaculus API. However, while more efficient than manual updates, this approach has several drawbacks: 1. The platform must store and process a separate forecast for each day, even though they all derive from a simple function. 2. Viewers can't see the full forecast trajectory, only the discrete submissions. 3. The forecaster's future projections and scenario contingencies are opaque. 
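The Squiggle example referenced above is not reproduced in this transcript. As a rough stand-in, here is a hypothetical Python sketch (names and numbers are illustrative, not from the post) of the kind of date-dependent forecast function being described — the same function that, under the scorable-function approach, could be submitted and evaluated directly rather than sampled daily by a bot:

```python
from datetime import date

def p_event_x(as_of: date) -> float:
    """Hypothetical forecast for 'Will event X happen in 2024?'.

    Starts near 0.90 at the beginning of 2024 and decays linearly
    toward 0.10 by year's end, assuming no resolution has occurred.
    """
    start, end = date(2024, 1, 1), date(2024, 12, 31)
    frac_elapsed = (as_of - start).days / (end - start).days
    frac_elapsed = min(max(frac_elapsed, 0.0), 1.0)  # clamp to [0, 1]
    return 0.90 + (0.10 - 0.90) * frac_elapsed

# A bot would sample this daily and push each value to the platform's API;
# a scorable-function platform could instead store p_event_x itself and
# evaluate it on demand, e.g.:
print(round(p_event_x(date(2024, 1, 1)), 2))    # 0.9
print(round(p_event_x(date(2024, 7, 1)), 2))    # ~0.5
print(round(p_event_x(date(2024, 12, 31)), 2))  # 0.1
```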
Scorable functions elegantly solve these issues. Instead of a bot submitting individual forecasts, the forecaster simply submits the generating function itself. You can imagine there being a custom input box directly in Metaculus. The function submitted would be the same, though it might be provided as a lambda function or with a standardized function name. The platform can then evaluate this function on-demand to generate up-to-date forecasts. Viewers see the comp...
May 22, 2024 • 6min

LW - Anthropic announces interpretability advances. How much does this advance alignment? by Seth Herd

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Anthropic announces interpretability advances. How much does this advance alignment?, published by Seth Herd on May 22, 2024 on LessWrong. Anthropic just published a pretty impressive set of results in interpretability. This raises, for me, some questions and a concern: Interpretability helps, but it isn't alignment, right? It seems to me as though the vast bulk of alignment funding is now going to interpretability. Who is thinking about how to leverage interpretability into alignment? It intuitively seems as though we are better off the more we understand the cognition of foundation models. I think this is true, but there are sharp limits: it will be impossible to track the full cognition of an AGI, and simply knowing what it's thinking about will be inadequate to know whether it's making plans you like. One can think about bioweapons, for instance, to either produce them or prevent producing them. More on these at the end; first a brief summary of their results. In this work, they located interpretable features in Claude 3 Sonnet using sparse autoencoders, and manipulated model behavior using those features as steering vectors. They find features for subtle concepts; they highlight features for: The Golden Gate Bridge 34M/31164353: Descriptions of or references to the Golden Gate Bridge. Brain sciences 34M/9493533: discussions of neuroscience and related academic research on brains or minds. Monuments and popular tourist attractions 1M/887839. Transit infrastructure 1M/3. [links to examples] ... We also find more abstract features - responding to things like bugs in computer code, discussions of gender bias in professions, and conversations about keeping secrets. ...we found features corresponding to: Capabilities with misuse potential (code backdoors, developing biological weapons) Different forms of bias (gender discrimination, racist claims about crime) Potentially problematic AI behaviors (power-seeking, manipulation, secrecy) Presumably, the existence of such features will surprise nobody who's used and thought about large language models. It is difficult to imagine how they would do what they do without using representations of subtle and abstract concepts. They used the dictionary learning approach, and found distributed representations of features: Our general approach to understanding Claude 3 Sonnet is based on the linear representation hypothesis and the superposition hypothesis from the publication, Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet. Or to put it more plainly: It turns out that each concept is represented across many neurons, and each neuron is involved in representing many concepts. Representations in the brain definitely follow that description, and the structure of representations seems pretty similar as far as we can guess from animal studies and limited data on human language use. They also include a fascinating image of near neighbors to the feature for internal conflict (see header image). So, back to the broader question: it is clear how this type of interpretability helps with AI safety: being able to monitor when it's activating features for things like bioweapons, and use those features as steering vectors, can help control the model's behavior. It is not clear to me how this generalizes to AGI. And I am concerned that too few of us are thinking about this. 
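For readers unfamiliar with the mechanics, here is a toy sketch of what "using a feature as a steering vector" means in practice — a generic illustration with made-up shapes and values, not Anthropic's actual code or the real Claude 3 Sonnet features (which are read off a trained sparse autoencoder rather than sampled at random):

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 512                                    # residual-stream width (made-up size)
activations = rng.normal(size=d_model)           # some layer's activation vector
feature_direction = rng.normal(size=d_model)     # stand-in for one SAE feature's decoder vector
feature_direction /= np.linalg.norm(feature_direction)

# Monitoring: how strongly is this feature active right now?
feature_activation = activations @ feature_direction

# Steering: nudge the activations along the feature direction before the next
# layer runs; positive alpha amplifies the associated concept, negative suppresses it.
alpha = 5.0
steered = activations + alpha * feature_direction

print(f"before: {feature_activation:.2f}, after: {steered @ feature_direction:.2f}")
```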
It seems pretty apparent how detecting lying will dramatically help in pretty much any conceivable plan for technical alignment of AGI. But it seems like being able to monitor an entire thought process of a being smarter than us is impossible on the face of it. I think the hope is that we can detect and monitor cognition that is about dangerous topics, so we don't need to follow its full train of thought. If we can tell what an AGI is thinking ...
May 22, 2024 • 3min

AF - Announcing Human-aligned AI Summer School by Jan Kulveit

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Announcing Human-aligned AI Summer School, published by Jan Kulveit on May 22, 2024 on The AI Alignment Forum. The fourth Human-aligned AI Summer School will be held in Prague from 17th to 20th July 2024. We will meet for four intensive days of talks, workshops, and discussions covering the latest trends in AI alignment research and broader framings of the field. Apply now; applications are evaluated on a rolling basis. The intended audience of the school is people interested in learning more about AI alignment topics: PhD students, researchers working in ML/AI outside academia, and talented students. Format of the school The school is focused on teaching and exploring approaches and frameworks, less on presentation of the latest research results. The content of the school is mostly technical - it is assumed the attendees understand current ML approaches and some of the underlying theoretical frameworks. This year, the school will cover these main topics: Overview of the alignment problem and current approaches. Alignment of large language models: RLHF, DPO and beyond. Methods used to align current large language models and their shortcomings. Evaluating and measuring AI systems: How to understand and oversee current AI systems on the behavioral level. Interpretability and the science of deep learning: What's going on inside of the models? AI alignment theory: While 'prosaic' approaches to alignment focus on current systems, theory aims for deeper understanding and better generalizability. Alignment in the context of complex systems and multi-agent settings: What should the AI be aligned to? In most realistic settings, we can expect there are multiple stakeholders and many interacting AI systems; any solutions to the alignment problem need to handle multi-agent settings. The school consists of lectures and topical series, focused smaller-group workshops and discussions, expert panels, and opportunities for networking, project brainstorming and informal discussions. The detailed program of the school will be announced shortly before the event. See below for a program outline, and see e.g. the program of the previous school for an illustration of the content and structure. Confirmed speakers Stephen Casper - Algorithmic Alignment Group, MIT. Stanislav Fort - Google DeepMind. Jesse Hoogland - Timaeus. Jan Kulveit - Alignment of Complex Systems, Charles University. Mary Phuong - Google DeepMind. Deger Turan - AI Objectives Institute and Metaculus. Vikrant Varma - Google DeepMind. Neel Nanda - Google DeepMind. (more to be announced later) Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
May 22, 2024 • 2min

EA - The Charity Commission has concluded its inquiry into Effective Ventures Foundation UK by Rob Gledhill

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Charity Commission has concluded its inquiry into Effective Ventures Foundation UK, published by Rob Gledhill on May 22, 2024 on The Effective Altruism Forum. The Charity Commission for England and Wales has concluded its statutory inquiry into Effective Ventures Foundation UK (EVF UK), which was originally launched in 2023 following the collapse of FTX. The full report on the inquiry can be found here, and the Commission's press release on the inquiry can be found here. The inquiry's scope was to examine: The extent of any risk to EVF's assets. The extent to which the trustees were complying with their legal obligations to protect the charity's property. The governance and administration of the charity by the trustees.[1] We are pleased that "the inquiry found that the trustees took appropriate steps to protect the charity's funds and complied with their legal duties acting diligently and quickly following the collapse of FTX." The Commission's report notes the full cooperation of EVF's trustees and that they "sought to act in the charity's best interests." Although the Commission noted that there had been a "lack of clarity" around historical conflicts of interest and a lack of formal process for identifying conflicts of interest, "in practice no issues arose" and "there is no evidence to suggest that there were any unmanaged conflicts of interest regarding funds the charity received from the FTX Foundation or that any trustee had acted in a way contrary to the interests of the charity." They also note that subsequent to FTX's collapse, "Both the finance and legal teams at the charity have been strengthened and policies have been bolstered or created with more robust frameworks." I'm pleased that the Charity Commission recognises the improvements that have been made at EV. This report doesn't change EV's strategy to decentralise, as previously announced here. 1. ^ For further context, the Charity Commission is a regulator in the UK whose responsibilities include: preventing mismanagement and misconduct by charities; promoting compliance with charity law; protecting the property, beneficiaries, and work of charities; and safeguarding the public's trust and confidence in charities. A statutory inquiry is a tool for the Commission to establish facts and collect evidence related to these responsibilities. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
May 22, 2024 • 3min

EA - A tale of two Sams by Geoffrey Miller

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: A tale of two Sams, published by Geoffrey Miller on May 22, 2024 on The Effective Altruism Forum. After Sam Bankman-Fried proved to be a sociopathic fraudster and a massive embarrassment to EA, we did much soul-searching about what EAs did wrong, in failing to detect and denounce his sociopathic traits. We spent, collectively, thousands of hours ruminating about what we can do better, next time we encounter an unprincipled leader who acts like it's OK to abuse and betray people to pursue their grandiose vision, who gets caught up in runaway greed for wealth and power, who violates core EA values, and who threatens the long-term flourishing of sentient beings. Well, that time is now. Sam Altman at OpenAI has been proving himself, again and again, in many different domains and issues, to be a manipulative, deceptive, unwise, and arrogant leader, driven by hubris to build AGI as fast as possible, with no serious concern about the extinction risks he's imposing on us all. We are all familiar with the recent controversies and scandals at OpenAI, from the boardroom coup, to the mass violations of intellectual property in training LLMs, to the collapse of the Superalignment Team, to the draconian Non-Disparagement Agreements, to the new Scarlett Johansson voice emulation scandal this week. The evidence for Sam Altman being a Bad Actor seems, IMHO, at least as compelling as the evidence for Sam Bankman-Fried being a Bad Actor before the FTX collapse in Nov 2022. And the stakes are much, much higher for humanity (if not for EA's reputation). So what are we going to do about it? Should we keep encouraging young talented EAs to go work in the AI industry, in the hopes that they can nudge the AI companies from the inside towards safe AGI alignment -- despite the fact that many of them end up quitting, disillusioned and frustrated? Should we keep making excuses for OpenAI, and Anthropic, and DeepMind, pursuing AGI at recklessly high speed, despite the fact that AI capabilities research is far out-pacing AI safety and alignment research? Should we keep offering the public the hope that 'AI alignment' is a solvable problem, when we have no evidence that aligning AGIs with 'human values' would be any easier than aligning Palestinians with Israeli values, or aligning libertarian atheists with Russian Orthodox values -- or even aligning Gen Z with Gen X values? I don't know. But if we feel any culpability or embarrassment about the SBF/FTX debacle, I think we should do some hard thinking about how to deal with the OpenAI debacle. Many of us work on AI safety, and are concerned about extinction risks. I worry that all of our efforts in these directions could be derailed by a failure to call out the second rich, influential, pseudo-EA, sociopathic Sam that we've learned about in the last two years. If OpenAI 'succeeds' in developing AGI within a few years, long before we have any idea how to control AGI, that could be game over for our species. Especially if Sam Altman and his supporters and sycophants are still running OpenAI. [Epistemic note: I've written this hastily, bluntly, with emotion, because I think there's some urgency to EA addressing these issues.] Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
