
The Nonlinear Library

Latest episodes

Sep 13, 2024 • 41min

AF - Estimating Tail Risk in Neural Networks by Jacob Hilton

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Estimating Tail Risk in Neural Networks, published by Jacob Hilton on September 13, 2024 on The AI Alignment Forum. Machine learning systems are typically trained to maximize average-case performance. However, this method of training can fail to meaningfully control the probability of tail events that might cause significant harm. For instance, while an artificial intelligence (AI) assistant may be generally safe, it would be catastrophic if it ever suggested an action that resulted in unnecessary large-scale harm. Current techniques for estimating the probability of tail events are based on finding inputs on which an AI behaves catastrophically. Since the input space is so large, it might be prohibitive to search through it thoroughly enough to detect all potential catastrophic behavior. As a result, these techniques cannot be used to produce AI systems that we are confident will never behave catastrophically. We are excited about techniques to estimate the probability of tail events that do not rely on finding inputs on which an AI behaves badly, and can thus detect a broader range of catastrophic behavior. We think developing such techniques is an exciting problem to work on to reduce the risk posed by advanced AI systems: Estimating tail risk is a conceptually straightforward problem with relatively objective success criteria; we are predicting something mathematically well-defined, unlike instances of eliciting latent knowledge (ELK) where we are predicting an informal concept like "diamond". Improved methods for estimating tail risk could reduce risk from a variety of sources, including central misalignment risks like deceptive alignment. Improvements to current methods can be found both by doing empirical research, or by thinking about the problem from a theoretical angle. This document will discuss the problem of estimating the probability of tail events and explore estimation strategies that do not rely on finding inputs on which an AI behaves badly. In particular, we will: Introduce a toy scenario about an AI engineering assistant for which we want to estimate the probability of a catastrophic tail event. Explain some deficiencies of adversarial training, the most common method for reducing risk in contemporary AI systems. Discuss deceptive alignment as a particularly dangerous case in which adversarial training might fail. Present methods for estimating the probability of tail events in neural network behavior that do not rely on evaluating behavior on concrete inputs. Conclude with a discussion of why we are excited about work aimed at improving estimates of the probability of tail events. This document describes joint research done with Jacob Hilton, Victor Lecomte, David Matolcsi, Eric Neyman, Thomas Read, George Robinson, and Gabe Wu. Thanks additionally to Ajeya Cotra, Lukas Finnveden, and Erik Jenner for helpful comments and suggestions. A Toy Scenario Consider a powerful AI engineering assistant. Write M for this AI system, and M(x) for the action it suggests given some project description x. We want to use this system to help with various engineering projects, but would like it to never suggest an action that results in large-scale harm, e.g. creating a doomsday device. 
In general, we define a behavior as catastrophic if it must never occur in the real world.[1] An input is catastrophic if it would lead to catastrophic behavior. Assume we can construct a catastrophe detector C that tells us if an action M(x) will result in large-scale harm. For the purposes of this example, we will assume both that C has a reasonable chance of catching all catastrophes and that it is feasible to find a useful engineering assistant M that never triggers C (see Catastrophe Detectors for further discussion). We will also assume we can use C to train M, but that it is ...
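A brief illustration of why searching or sampling over inputs fails here (the model, detector, and input distribution below are stand-ins invented for this sketch, not anything from the post): a naive Monte Carlo estimate of the probability that M triggers C needs on the order of 1/p samples to observe even one catastrophe, which is hopeless when p is astronomically small.

```python
import random

def estimate_tail_risk(model, detector, sample_input, n_samples=10_000):
    """Naive Monte Carlo estimate of P[detector(model(x)) = True] over random inputs x."""
    hits = sum(detector(model(sample_input())) for _ in range(n_samples))
    return hits / n_samples

# Illustrative stand-ins for M, C, and the input distribution.
model = lambda x: x                 # M(x): the suggested action
detector = lambda a: a > 0.999999   # C: flags roughly 1-in-a-million actions as catastrophic
sample_input = lambda: random.random()

print(estimate_tail_risk(model, detector, sample_input))
# With n_samples far below 1/p this prints 0.0 on almost every run, which is the point:
# sampling (or adversarial search) over inputs cannot certify that no catastrophic input exists.
```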
Sep 13, 2024 • 12min

AF - Can startups be impactful in AI safety? by Esben Kran

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Can startups be impactful in AI safety?, published by Esben Kran on September 13, 2024 on The AI Alignment Forum. With Lakera's strides in securing LLM APIs, Goodfire AI's path to scaling interpretability, and 20+ model evaluation startups among much else, there's a rising number of technical startups attempting to secure the model ecosystem. Of course, they have varying levels of impact on superintelligence containment and security, and even with these companies, there's a lot of potential for aligned, ambitious and high-impact startups within the ecosystem. This point isn't new and has been made in our previous posts and by Eric Ho (Goodfire AI CEO). To set the stage, our belief is that these are the types of companies that will have a positive impact: startups with a profit incentive completely aligned with improving AI safety; that have a deep technical background to shape AGI deployment; and that do not try to compete with AGI labs. Piloting AI safety startups To understand impactful technical AI safety startups better, Apart Research joined forces with collaborators from Juniper Ventures, vectorview (alumni from the latest YC cohort), Rudolf (from the upcoming def/acc cohort), Tangentic AI, and others. We then invited researchers, engineers, and students to resolve a key question: "can we come up with ideas that scale AI safety into impactful for-profits?" The hackathon took place during a weekend two weeks ago with a keynote by Esben Kran (co-director of Apart) along with 'HackTalks' by Rudolf Laine (def/acc) and Lukas Petersson (YC / vectorview). Individual submissions were a 4-page report with the problem statement, why this solution will work, what the key risks of said solution are, and any experiments or demonstrations of the solution the team made. This post details the top 6 projects and excludes 2 projects that were made private by request (hopefully turning into impactful startups now!). In total, we had 101 signups and 11 final entries. Winners were decided by an LME model conditioned on reviewer bias. Watch the authors' lightning talks here. Dark Forest: Making the web more trustworthy with third-party content verification By Mustafa Yasir (AI for Cyber Defense Research Centre, Alan Turing Institute) Abstract: 'DarkForest is a pioneering Human Content Verification System (HCVS) designed to safeguard the authenticity of online spaces in the face of increasing AI-generated content. By leveraging graph-based reinforcement learning and blockchain technology, DarkForest proposes a novel approach to safeguarding the authentic and humane web. We aim to become the vanguard in the arms race between AI-generated content and human-centric online spaces.' Content verification workflow supported by graph-based RL agents deciding verifications. Reviewer comments: Natalia: Well explained problem with clear need addressed. I love that you included the content creation process - although you don't explicitly address how you would attract content creators to use your platform over others in their process. Perhaps exploring what features of platforms drive creators to each might help you make a compelling case for using yours beyond the verification capabilities. I would have also liked to see more details on how the verification decision is made and how accurate this is on existing datasets.
Nick: There's a lot of valuable stuff in here regarding content moderation and identity verification. I'd narrow it to one problem-solution pair (e.g., "jobs to be done") and focus more on risks around early product validation (deep interviews with a range of potential users and buyers regarding value) and go-to-market. It might also be worth checking out Musubi. Read the full project here. Simulation Operators: An annotation operation for alignment of robot By Ardy Haroen (USC) Abstrac...
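The judging note above ("winners were decided by an LME model conditioned on reviewer bias") presumably refers to a linear mixed-effects model in which reviewer leniency is a random effect. The post does not give the specification, so the sketch below is only a guess at the general shape, with invented scores, reviewers, and project names.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical review data: one row per (reviewer, project) score on a 1-10 scale.
reviews = pd.DataFrame({
    "project":  ["DarkForest", "DarkForest", "DarkForest", "SimOps", "SimOps", "SimOps"],
    "reviewer": ["A", "B", "C", "A", "B", "C"],
    "score":    [8, 6, 7, 7, 7, 9],
})

# Project quality enters as fixed effects, reviewer leniency as a random intercept,
# so systematically harsh or generous reviewers don't dominate the ranking.
model = smf.mixedlm("score ~ 0 + project", reviews, groups=reviews["reviewer"])
result = model.fit()
print(result.fe_params.sort_values(ascending=False))  # bias-adjusted project ranking
```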
Sep 13, 2024 • 15min

LW - The Great Data Integration Schlep by sarahconstantin

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Great Data Integration Schlep, published by sarahconstantin on September 13, 2024 on LessWrong. This is a little rant I like to give, because it's something I learned on the job that I've never seen written up explicitly. There are a bunch of buzzwords floating around regarding computer technology in an industrial or manufacturing context: "digital transformation", "the Fourth Industrial Revolution", "Industrial Internet of Things". What do those things really mean? Do they mean anything at all? The answer is yes, and what they mean is the process of putting all of a company's data on computers so it can be analyzed. This is the prerequisite to any kind of "AI" or even basic statistical analysis of that data; before you can start applying your fancy algorithms, you need to get that data in one place, in a tabular format. Wait, They Haven't Done That Yet? In a manufacturing context, a lot of important data is not on computers. Some data is not digitized at all, but literally on paper: lab notebooks, QA reports, work orders, etc. Other data is "barely digitized", in the form of scanned PDFs of those documents. Fine for keeping records, but impossible to search or analyze statistically. (A major aerospace manufacturer, from what I heard, kept all of the results of airplane quality tests in the form of scanned handwritten PDFs of filled-out forms. Imagine trying to compile trends in quality performance!) Still other data is siloed inside machines on the factory floor. Modern, automated machinery can generate lots of data - sensor measurements, logs of actuator movements and changes in process settings - but that data is literally stored in that machine, and only that machine. Manufacturing process engineers, for nearly a hundred years, have been using data to inform how a factory operates, generally using a framework known as statistical process control. However, in practice, much more data is generated and collected than is actually used. Only a few process variables get tracked, optimized, and/or used as inputs to adjust production processes; the rest are "data exhaust", to be ignored and maybe deleted. In principle the "excess" data may be relevant to the facility's performance, but nobody knows how, and they're not equipped to find out. This is why manufacturing/industrial companies will often be skeptical about proposals to "use AI" to optimize their operations. To "use AI", you need to build a model around a big dataset. And they don't have that dataset. You cannot, in general, assume it is possible to go into a factory and find a single dataset that is "all the process logs from all the machines, end to end". Moreover, even when that dataset does exist, there often won't be even the most basic built-in tools to analyze it. In an unusually modern manufacturing startup, the M.O. might be "export the dataset as .csv and use Excel to run basic statistics on it." Why Data Integration Is Hard In order to get a nice standardized dataset that you can "do AI to" (or even "do basic statistics/data analysis to") you need to: 1. obtain the data 2. digitize the data (if relevant) 3. standardize/"clean" the data 4. set up computational infrastructure to store, query, and serve the data Data Access Negotiation, AKA Please Let Me Do The Work You Paid Me For Obtaining the data is a hard human problem.
That is, people don't want to give it to you. When you're a software vendor to a large company, it's not at all unusual for it to be easier to make a multi-million dollar sale than to get the data access necessary to actually deliver the finished software tool. Why? Partly, this is due to security concerns. There will typically be strict IT policies about what data can be shared with outsiders, and what types of network permissions are kosher. For instance, in the semiconduc...
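For readers unfamiliar with the statistical process control framework mentioned above, the payoff of all the integration work the post describes can be very simple: once the sensor logs live in one table, classic Shewhart-style control limits take a few lines of analysis. The data and column names below are hypothetical.

```python
import pandas as pd

# Hypothetical sensor log after the data has been obtained, digitized, and cleaned.
df = pd.DataFrame({
    "timestamp": pd.date_range("2024-09-01", periods=10, freq="h"),
    "temp_c": [71.2, 70.8, 71.5, 70.9, 71.1, 71.0, 71.3, 70.7, 74.6, 71.2],
})

# Control limits computed from an in-control baseline window (the classic 3-sigma rule).
baseline = df["temp_c"].iloc[:8]
center, sigma = baseline.mean(), baseline.std()
df["out_of_control"] = (df["temp_c"] - center).abs() > 3 * sigma

print(df[df["out_of_control"]])  # flags the 74.6 reading
```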
Sep 13, 2024 • 14min

LW - AI, centralization, and the One Ring by owencb

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI, centralization, and the One Ring, published by owencb on September 13, 2024 on LessWrong. People thinking about the future of AI sometimes talk about a single project 'getting there first' - achieving AGI, and leveraging this into a decisive strategic advantage over the rest of the world. I claim we should be worried about this scenario. That doesn't necessarily mean we should try to stop it. Maybe it's inevitable; or maybe it's the best available option. But I think that there are some pretty serious reasons for concern. At minimum, it seems important to stay in touch with those. In some ways, I think a single successful AGI project would be analogous to the creation of the One Ring. In The Lord of the Rings, Sauron had forged the One Ring, an artifact powerful enough to gain control of the rest of the world. While he was stopped, the Ring itself continued to serve as a source of temptation and corruption to those who would wield its power. Similarly, a centralized AGI project might gain enormous power relative to the rest of the world; I think we should worry about the corrupting effects of this kind of power. Forging the One Ring was evil Of course, in the story we are told that the Enemy made the Ring, and that he was going to use it for evil ends; and so of course it was evil. But I don't think that's the whole reason that forging the Ring was bad. I think there's something which common-sense morality might term evil about a project which accumulates enough power to take over the world. No matter its intentions, it is deeply and perhaps abruptly disempowering to the rest of the world. All the other actors - countries, organizations, and individuals - have the rug pulled out from under them. Now, depending on what is done with the power, many of those actors may end up happy about it. But there would still, I believe, be something illegitimate/bad about this process. So there are reasons to refrain from it[1]. In contrast, I think there is something deeply legitimate about sharing your values in a cooperative way and hoping to get others on board with that. And by the standards of our society, it is also legitimate to just accumulate money by selling goods or services to others, in order that your values get a larger slice of the pie. What if the AGI project is not run by a single company or even a single country, but by a large international coalition of nations? I think that this is better, but may still be tarred with some illegitimacy, if it doesn't have proper buy-in (and ideally oversight) from the citizenry. And buy-in from the citizenry seems hard to get if this is occurring early in a fast AI takeoff. Perhaps it is more plausible in a slow takeoff, or far enough through that the process itself could be helped by AI. Of course, people may have tough decisions to make, and elements of illegitimacy may not be reason enough to refrain from a path. But they're at least worth attending to. The difficulty of using the One Ring for good In The Lord of the Rings, there is a recurring idea that attempts to use the One Ring for good would become twisted, and ultimately serve evil. Here the narrative is that the Ring itself would exert influence, and being an object of evil, that would further evil. I wouldn't take this narrative too literally. 
I think powerful AI could be used to do a tremendous amount of good, and there is nothing inherent in the technology which will make its applications evil. Again, though, I am wary of having the power too centralized. If one centralized organization controls the One Ring, then everyone else lives at their sufferance. This may be bad, even if that organization acts in benevolent ways - just as it is bad for someone to be a slave, even with a benevolent master[2]. Similarly, if the state is too strong relative to its citize...
Sep 13, 2024 • 40min

AF - How difficult is AI Alignment? by Samuel Dylan Martin

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How difficult is AI Alignment?, published by Samuel Dylan Martin on September 13, 2024 on The AI Alignment Forum. This work was funded by Polaris Ventures. There is currently no consensus on how difficult the AI alignment problem is. We have yet to encounter any real-world, in-the-wild instances of the most concerning threat models, like deceptive misalignment. However, there are compelling theoretical arguments which suggest these failures will arise eventually. Will current alignment methods accidentally train deceptive, power-seeking AIs that appear aligned, or not? We must make decisions about which techniques to avoid and which are safe despite not having a clear answer to this question. To this end, a year ago, we introduced the AI alignment difficulty scale, a framework for understanding the increasing challenges of aligning artificial intelligence systems with human values. This follow-up article revisits our original scale, exploring how our understanding of alignment difficulty has evolved and what new insights we've gained. This article will explore three main themes that have emerged as central to our understanding: 1. The Escalation of Alignment Challenges: We'll examine how alignment difficulties increase as we go up the scale, from simple reward hacking to complex scenarios involving deception and gradient hacking. Through concrete examples, we'll illustrate these shifting challenges and why they demand increasingly advanced solutions. These examples will illustrate what observations we should expect to see "in the wild" at different levels, which might change our minds about how easy or difficult alignment is. 2. Dynamics Across the Difficulty Spectrum: We'll explore the factors that change as we progress up the scale, including the increasing difficulty of verifying alignment, the growing disconnect between alignment and capabilities research, and the critical question of which research efforts are net positive or negative in light of these challenges. 3. Defining and Measuring Alignment Difficulty: We'll tackle the complex task of precisely defining "alignment difficulty," breaking down the technical, practical, and other factors that contribute to the alignment problem. This analysis will help us better understand the nature of the problem we're trying to solve and what factors contribute to it. The Scale The high-level description of the alignment problem, provided in the previous post, was: "The alignment problem" is the problem of aligning sufficiently powerful AI systems, such that we can be confident they will be able to reduce the risks posed by misused or unaligned AI systems. We previously introduced the AI alignment difficulty scale, with 10 levels that map out the increasing challenges. The scale ranges from "alignment by default" to theoretical impossibility, with each level representing more complex scenarios requiring more advanced solutions. It is reproduced here:

Alignment Difficulty Scale (columns: Difficulty Level; Alignment technique X is sufficient; Description; Key Sources of risk)

Level 1: (Strong) Alignment by Default
Description: As we scale up AI models without instructing or training them for specific risky behaviour or imposing problematic and clearly bad goals (like 'unconditionally make money'), they do not pose significant risks. Even superhuman systems basically do the commonsense version of what external rewards (if RL) or language instructions (if LLM) imply.
Key sources of risk: Misuse and/or recklessness with training objectives. RL of powerful models towards badly specified or antisocial objectives is still possible, including accidentally through poor oversight, recklessness or structural factors.

Level 2: Reinforcement Learning from Human Feedback
Description: We need to ensure that the AI behaves well even in edge cases by guiding it more carefully using human feedback in a wide range of situations...
Sep 13, 2024 • 2min

EA - Farming groups and veterinarians submit amicus briefs against cruelty to chickens by Sage Max

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Farming groups and veterinarians submit amicus briefs against cruelty to chickens, published by Sage Max on September 13, 2024 on The Effective Altruism Forum. Yesterday in the North Carolina Court of Appeals in Raleigh, three amicus briefs were offered in support of Legal Impact for Chickens' lawsuit against KFC-supplier Case Farms for cruelty toward newborn chicks. These amicus briefs were submitted by farming groups, nonprofits, and veterinarians (including the Northeast Organic Dairy Producers Alliance and Asheville-based Dr. Laura Cochrane, DVM) and written by attorneys including former North Carolina Appellate Judge Lucy Inman. "Defendants' alleged conduct is not only unethical, but completely contrary to the professional standards of modern poultry farming," say Food Animal Concerns Trust, Northeast Organic Dairy Producers Alliance, and The Cornucopia Institute in their brief, written by former North Carolina Appellate Judge Lucy Inman. Beautiful Together Animal Sanctuary's brief states, "Legal Impact for Chicken's complaint alleges shocking atrocities that, if committed against a dog or cat, would merit universal condemnation." "North Carolina takes animal cruelty seriously," states a brief by veterinarians Dr. Laura Cochrane and Dr. Martha Smith-Blackmore, and DEGA Mobile Veterinary Care, a nonprofit that helps North Carolina pets with low- or no-income owners. You can learn more about the lawsuit here: legalimpactforchickens.org/case-farms Thank you for your time and attention to these important animal-welfare issues! Sage & Legal Impact for Chickens Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Sep 13, 2024 • 18min

LW - Open Problems in AIXI Agent Foundations by Cole Wyeth

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Open Problems in AIXI Agent Foundations, published by Cole Wyeth on September 13, 2024 on LessWrong. I believe that the theoretical foundations of the AIXI agent and variations are a surprisingly neglected and high leverage approach to agent foundations research. Though discussion of AIXI is pretty ubiquitous in A.I. safety spaces, underscoring AIXI's usefulness as a model of superintelligence, this is usually limited to poorly justified verbal claims about its behavior which are sometimes questionable or wrong. This includes, in my opinion, a serious exaggeration of AIXI's flaws. For instance, in a recent post I proposed a simple extension of AIXI off-policy that seems to solve the anvil problem in practice - in fact, in my opinion it has never been convincingly argued that the anvil problem would occur for an AIXI approximation. The perception that AIXI fails as an embedded agent seems to be one of the reasons it is often dismissed with a cursory link to some informal discussion. However, I think AIXI research provides a more concrete and justified model of superintelligence than most subfields of agent foundations [1]. In particular, a Bayesian superintelligence must optimize some utility function using a rich prior, requiring at least structural similarity to AIXI. I think a precise understanding of how to represent this utility function may be a necessary part of any alignment scheme on pain of wireheading. And this will likely come down to understanding some variant of AIXI, at least if my central load bearing claim is true: The most direct route to understanding real superintelligent systems is by analyzing agents similar to AIXI. Though AIXI itself is not a perfect model of embedded superintelligence, it is perhaps the simplest member of a family of models rich enough to elucidate the necessary problems and exhibit the important structure. Just as the Riemann integral is an important precursor of Lebesgue integration, despite qualitative differences, it would make no sense to throw AIXI out and start anew without rigorously understanding the limits of the model. And there are already variants of AIXI that surpass some of those limits, such as the reflective version that can represent other agents as powerful as itself. This matters because the theoretical underpinnings of AIXI are still very spotty and contain many tractable open problems. In this document, I will collect several of them that I find most important - and in many cases am actively pursuing as part of my PhD research advised by Ming Li and Marcus Hutter. The AIXI (~= "universal artificial intelligence") research community is small enough that I am willing to post many of the directions I think are important publicly; in exchange I would appreciate a heads-up from anyone who reads a problem on this list and decides to work on it, so that we don't duplicate efforts (I am also open to collaborate). The list is particularly tilted towards those problems with clear, tractable relevance to alignment OR philosophical relevance to human rationality. Naturally, most problems are mathematical. Particularly where they intersect recursion theory, these problems may have solutions in the mathematical literature I am not aware of (keep in mind that I am a lowly second year PhD student). Expect a scattering of experimental problems to be interspersed as well. 
To save time, I will assume that the reader has a copy of Jan Leike's PhD thesis on hand. In my opinion, he has made much of the existing foundational progress since Marcus Hutter invented the model. Also, I will sometimes refer to the two foundational books on AIXI as UAI = Universal Artificial Intelligence and Intro to UAI = An Introduction to Universal Artificial Intelligence, and the canonical textbook on algorithmic information theory Intro to K = An...
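For readers without Leike's thesis or the textbooks to hand, the model under discussion is Hutter's AIXI agent, which at each step k selects the action maximizing expected total reward up to a horizon m under a Solomonoff-style mixture over computable environments. Its standard textbook form (reproduced here from the general literature, not from this post) is:

```latex
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
       \bigl[ r_k + \cdots + r_m \bigr]
       \sum_{q \,:\, U(q, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

where U is a universal monotone Turing machine, q ranges over programs of length \ell(q), and the a, o, r are actions, observations, and rewards.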
Sep 13, 2024 • 9min

LW - How to Give in to Threats (without incentivizing them) by Mikhail Samin

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How to Give in to Threats (without incentivizing them), published by Mikhail Samin on September 13, 2024 on LessWrong. TL;DR: using a simple mixed strategy, LDT can give in to threats, ultimatums, and commitments - while incentivizing cooperation and fair[1] splits instead. This strategy made it much more intuitive to many people I've talked to that smart agents probably won't do weird everyone's-utility-eating things like threatening each other or participating in commitment races. 1. The Ultimatum game This part is taken from planecrash[2][3]. You're in the Ultimatum game. You're offered 0-10 dollars. You can accept or reject the offer. If you accept, you get what's offered, and the offerer gets $(10-offer). If you reject, both you and the offerer get nothing. The simplest strategy that incentivizes fair splits is to accept everything ≥ 5 and reject everything < 5. The offerer can't do better than by offering you 5. If you accepted offers of 1, the offerer that knows this would always offer you 1 and get 9, instead of being incentivized to give you 5. Being unexploitable in the sense of incentivizing fair splits is a very important property that your strategy might have. With the simplest strategy, if you're offered 5..10, you get 5..10; if you're offered 0..4, you get 0 in expectation. Can you do better than that? What is a strategy that you could use that would get more than 0 in expectation if you're offered 1..4, while still being unexploitable (i.e., still incentivizing splits of at least 5)? I encourage you to stop here and try to come up with a strategy before continuing. The solution, explained by Yudkowsky in planecrash (children split 12 jellychips, so the offers are 0..12): When the children return the next day, the older children tell them the correct solution to the original Ultimatum Game. It goes like this: When somebody offers you a 7:5 split, instead of the 6:6 split that would be fair, you should accept their offer with slightly less than 6/7 probability. Their expected value from offering you 7:5, in this case, is 7 * slightly less than 6/7, or slightly less than 6. This ensures they can't do any better by offering you an unfair split; but neither do you try to destroy all their expected value in retaliation. It could be an honest mistake, especially if the real situation is any more complicated than the original Ultimatum Game. If they offer you 8:4, accept with probability slightly-more-less than 6/8, so they do even worse in their own expectation by offering you 8:4 than 7:5. It's not about retaliating harder, the harder they hit you with an unfair price - that point gets hammered in pretty hard to the kids, a Watcher steps in to repeat it. This setup isn't about retaliation, it's about what both sides have to do, to turn the problem of dividing the gains, into a matter of fairness; to create the incentive setup whereby both sides don't expect to do any better by distorting their own estimate of what is 'fair'. [The next stage involves a complicated dynamic-puzzle with two stations, that requires two players working simultaneously to solve. After it's been solved, one player locks in a number on a 0-12 dial, the other player may press a button, and the puzzle station spits out jellychips thus divided. The gotcha is, the 2-player puzzle-game isn't always of equal difficulty for both players.
Sometimes, one of them needs to work a lot harder than the other.] They play the 2-station video games again. There's less anger and shouting this time. Sometimes, somebody rolls a continuous-die and then rejects somebody's offer, but whoever gets rejected knows that they're not being punished. Everybody is just following the Algorithm. Your notion of fairness didn't match their notion of fairness, and they did what the Algorithm says to do in that case, but ...
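A minimal sketch of the acceptance rule described in the quoted passage, for the 12-jellychip version of the game (the epsilon margin is an arbitrary illustrative choice):

```python
import random

def accept_probability(offer, total=12, epsilon=0.01):
    """Probability of accepting `offer` out of `total` under the unexploitable mixed strategy."""
    fair = total / 2
    if offer >= fair:
        return 1.0  # fair or generous splits are always accepted
    offerer_share = total - offer
    # Accept just often enough that the offerer's expected payoff, offerer_share * p,
    # lands slightly below the fair payoff, and further below the more unfair the offer.
    return max(0.0, fair / offerer_share - epsilon)

def respond(offer, total=12):
    """Randomized accept/reject decision for a single round."""
    return random.random() < accept_probability(offer, total)

print(accept_probability(5), accept_probability(4))  # ~0.847 for a 7:5 offer, ~0.74 for 8:4
```

With this rule the offerer's expected take from a 7:5 offer is 7 * (6/7 - epsilon), just under 6, and from 8:4 it is 8 * (6/8 - epsilon), which is lower still, so unfair offers get strictly worse for the offerer the more unfair they are.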
Sep 12, 2024 • 15min

LW - Contra papers claiming superhuman AI forecasting by nikos

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Contra papers claiming superhuman AI forecasting, published by nikos on September 12, 2024 on LessWrong. [Conflict of interest disclaimer: We are FutureSearch, a company working on AI-powered forecasting and other types of quantitative reasoning. If thin LLM wrappers could achieve superhuman forecasting performance, this would obsolete a lot of our work.] Widespread, misleading claims about AI forecasting Recently we have seen a number of papers (Schoenegger et al., 2024; Halawi et al., 2024; Phan et al., 2024; Hsieh et al., 2024) with claims that boil down to "we built an LLM-powered forecaster that rivals human forecasters or even shows superhuman performance". These papers do not communicate their results carefully enough, shaping public perception in inaccurate and misleading ways. Some examples of public discourse: Ethan Mollick (>200k followers) tweeted the following about the paper Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Rival Human Crowd Accuracy by Schoenegger et al.: A post on Marginal Revolution with the title and abstract of the paper Approaching Human-Level Forecasting with Language Models by Halawi et al. elicits responses like "This is something that humans are notably terrible at, even if they're paid to do it. No surprise that LLMs can match us." "+1 The aggregate human success rate is a pretty low bar" A Twitter thread with >500k views on LLMs Are Superhuman Forecasters by Phan et al. claiming that "AI […] can predict the future at a superhuman level" had more than half a million views within two days of being published. The number of such papers on AI forecasting, and the vast amount of traffic on misleading claims, make AI forecasting a uniquely misunderstood area of AI progress. And it's one that matters. What does human-level or superhuman forecasting mean? "Human-level" or "superhuman" is a hard-to-define concept. In an academic context, we need to work with a reasonable operationalization to compare the skill of an AI forecaster with that of humans. One reasonable and practical definition of a superhuman AI forecaster is: The AI forecaster is able to consistently outperform the crowd forecast on a sufficiently large number of randomly selected questions on a high-quality forecasting platform.[1] (For a human-level forecaster, just replace "outperform" with "perform on par with".) Except for Halawi et al., the papers had a tendency to operationalize human-level or superhuman forecasting in ways falling short of that standard. Some issues we saw were:
Looking at average/random instead of aggregate or top performance (for superhuman claims)
Looking only at a small number of questions
Choosing a (probably) relatively easy target (i.e. Manifold)
Red flags for claims to (super)human AI forecasting accuracy Our experience suggests there are a number of things that can go wrong when building AI forecasting systems, including: 1. Failing to find up-to-date information on the questions. It's inconceivable on most questions that forecasts can be good without basic information. Imagine trying to forecast the US presidential election without knowing that Biden dropped out. 2. Drawing on up-to-date, but low-quality information. Ample experience shows low quality information confuses LLMs even more than it confuses humans.
Imagine forecasting election outcomes with biased polling data. Or, worse, imagine forecasting OpenAI revenue based on claims like
> The number of ChatGPT Plus subscribers is estimated between 230,000-250,000 as of October 2023.
without realising that this mixes up ChatGPT vs ChatGPT mobile. 3. Lack of high-quality quantitative reasoning. For a decent number of questions on Metaculus, good forecasts can be "vibed" by skilled humans and perhaps LLMs. But for many questions, simple calculations ...
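The excerpt does not name a scoring rule, but a standard way to operationalize "consistently outperform the crowd forecast" on binary questions is to compare Brier scores over a large random sample of resolved questions. A minimal sketch with made-up forecasts and outcomes:

```python
import numpy as np

def brier_score(probs, outcomes):
    """Mean squared error between probabilistic forecasts and 0/1 outcomes (lower is better)."""
    probs, outcomes = np.asarray(probs, dtype=float), np.asarray(outcomes, dtype=float)
    return float(np.mean((probs - outcomes) ** 2))

# Hypothetical resolved binary questions, with crowd and AI forecasts for each.
outcomes = [1, 0, 0, 1, 1]
crowd    = [0.80, 0.30, 0.10, 0.60, 0.90]
ai       = [0.70, 0.40, 0.20, 0.50, 0.80]

print("crowd:", brier_score(crowd, outcomes))
print("ai:   ", brier_score(ai, outcomes))
# "Superhuman" under the definition above would mean beating the crowd consistently across
# many randomly selected questions on a high-quality platform, not on a cherry-picked set.
```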
Sep 12, 2024 • 17min

EA - Growth theory for EAs - reading list and summary by Karthik Tadepalli

Karthik Tadepalli, an insightful author on economic growth and effective altruism, explores the vital intersection of growth theory and its impacts on global development and AI. He curates a reading list tailored for effective altruists, emphasizing important foundational concepts. Tadepalli discusses contrasting growth theories, highlights challenges in developing nations, and examines how resource misallocation affects productivity. His analysis provides a framework for understanding how economic models can inform contemporary societal issues.
