
The Nonlinear Library

Latest episodes

Aug 29, 2024 • 13min

AF - Solving adversarial attacks in computer vision as a baby version of general AI alignment by stanislavfort

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Solving adversarial attacks in computer vision as a baby version of general AI alignment, published by stanislavfort on August 29, 2024 on The AI Alignment Forum.

I spent the last few months trying to tackle the problem of adversarial attacks in computer vision from the ground up. The results of this effort are written up in our new paper Ensemble everything everywhere: Multi-scale aggregation for adversarial robustness (explainer on X/Twitter). Taking inspiration from biology, we reached state-of-the-art or above state-of-the-art robustness at 100x - 1000x less compute, got human-understandable interpretability for free, turned classifiers into generators, and designed transferable adversarial attacks on closed-source (v)LLMs such as GPT-4 or Claude 3. I strongly believe that there is a compelling case for devoting serious attention to solving the problem of adversarial robustness in computer vision, and I try to draw an analogy to the alignment of general AI systems here.

1. Introduction

In this post, I argue that the problem of adversarial attacks in computer vision is in many ways analogous to the larger task of general AI alignment. In both cases, we are trying to faithfully convey an implicit function locked within the human brain to a machine, and we do so extremely successfully on average. Under static evaluations, the human and machine functions match up exceptionally well. However, as is typical in high-dimensional spaces, some phenomena can be relatively rare and basically impossible to find by chance, yet ubiquitous in their absolute count. This is the case for adversarial attacks - imperceptible modifications to images that completely fool computer vision systems and yet have virtually no effect on humans. Their existence highlights a crucial and catastrophic mismatch between the implicit human vision function and the function learned by machines - a mismatch that can be exploited in a dynamic evaluation by an active, malicious agent. Such failure modes will likely be present in more general AI systems, and our inability to remedy them even in the more restricted vision context (yet) does not bode well for the broader alignment project. This is a call to action to solve the problem of adversarial vision attacks - a stepping stone on the path to aligning general AI systems.

2. Communicating implicit human functions to machines

The basic goal of computer vision can be viewed as trying to endow a machine with the same vision capabilities a human has. A human carries, locked inside their skull, an implicit vision function mapping visual inputs into semantically meaningful symbols, e.g. a picture of a tortoise into a semantic label tortoise. This function is represented implicitly and while we are extremely good at using it, we do not have direct, conscious access to its inner workings and therefore cannot communicate it to others easily. To convey this function to a machine, we usually form a dataset of fixed images and their associated labels. We then use a general enough class of functions, typically deep neural networks, and a gradient-based learning algorithm together with backpropagation to teach the machine how to correlate images with their semantic content, e.g. how to assign a label parrot to a picture of a parrot.
This process is extremely successful in communicating the implicit human vision function to the computer, and the implicit human and explicit, learned machine functions agree to a large extent. The agreement between the two is striking. Given how different the architectures are (a simulated graph-like function doing a single forward pass vs the wet protein brain of a mammal running continuous inference), how different the learning algorithms are (gradient descent with backpropagation vs something completely different but still unknown), a...
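To make the notion of an imperceptible adversarial modification concrete, here is a minimal sketch of the classic fast gradient sign method (FGSM) in PyTorch. This is not the multi-scale ensemble method from the paper; the model, tensors, and epsilon value are illustrative placeholders for any differentiable image classifier.

```python
# Minimal FGSM sketch: perturb an image in the direction that increases the loss.
# `model` is any differentiable PyTorch classifier; all names here are illustrative.
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=8 / 255):
    """Return an adversarially perturbed copy of `image`.

    image: tensor of shape (1, C, H, W) with values in [0, 1]
    label: tensor of shape (1,) holding the true class index
    epsilon: maximum per-pixel perturbation (small values are imperceptible)
    """
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step by the sign of the gradient, then clamp back to valid pixel range.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
```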
Aug 29, 2024 • 4min

EA - Announcing the Strategic Animal Funding Circle! by JamesÖz

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Announcing the Strategic Animal Funding Circle!, published by JamesÖz on August 29, 2024 on The Effective Altruism Forum.

(Yes, yet another funding circle. Thank Ambitious Impact.) I'm excited to announce the launch of the Strategic Animal Funding Circle, a new group of donors looking to support cost-effective and high-impact farmed animal welfare nonprofits during a critical growth phase where targeted funding can have an immense impact. We've just launched our first Request For Proposals which you can see in this document, as well as being outlined below. Apply here by September 20th to be considered for our first funding round. We aim for applicants to hear back on a decision by early November. If there are any donors (either currently giving or willing to give upwards of $100-250k to farmed animals per year) interested in joining, feel free to email me at jozden[at]mobius.life. We expect the funding circle will expose donors to more high-impact opportunities, create the space for important discussions and reduce vetting inefficiencies.

Request for Proposals

About this Funding Opportunity

We are a group of donors looking to support cost-effective and high-impact farmed animal welfare and protection nonprofits during a critical growth phase where targeted funding can have an immense impact. Grants will initially be one-off support, though renewals may be possible in some cases at the discretion of individual donors. In this round, we expect up to $1,000,000 will be available for funding across all grants - though this depends on the funder-applicant fit.

Eligibility Criteria:
- The intervention is focused on improving outcomes in farmed animal welfare or protection, which could include reducing the suffering of farmed animals, promoting the development of alternative proteins or reducing the consumption of animal products.
- The intervention can be service delivery, policy work, or any other intervention type that delivers measurable impact in improving animal welfare outcomes.
- You are a registered not-for-profit entity or are partnering with one as a fiscal agent.
- Has the potential and willingness to scale; and
- Preference is given towards organizations with less than 4 years of operating history, or less than 500k USD in annual budget.

Selection Criteria: What Makes a Competitive Application?

Competitive applications will have the following qualities:
- A Great Idea: The applicant is proposing an innovative program where funding can catalyze progress.
- Demonstrated Funding Gap: Support will enable work which wouldn't otherwise happen. We are most excited about areas that are neglected by other major farmed animal funders.
- Indications of Impact: The applicant can demonstrate results through metrics, internal data, or models backed by external data. Especially strong applicants state a "north star" metric by which they want to be held accountable in ~5 years.
- Scalability: The applicant has concrete expansion plans to scale and achieve significantly more impact.
- Continuity/Sustainability Plan: The applicant explains clearly how they will build on this grant.
- Cost Effectiveness: We are looking to support interventions that are highly cost-effective in achieving positive outcomes for animals.
- Quality of Evidence: What studies, data, or information exists to indicate that the program is or will be successful?
How to Apply: Apply here by September 20th to be considered for our first funding round. We aim for applicants to hear back on a decision by the end of October 30th. Future Rounds: We plan to invite and review applications twice per year, one round in Fall and another in Spring. About the Strategic Animal Funding Circle: We are a group of donors who support promising farmed animal welfare and animal protection nonprofits during the critical growth phase where targeted fund...
Aug 29, 2024 • 8min

LW - How to hire somebody better than yourself by lukehmiles

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How to hire somebody better than yourself, published by lukehmiles on August 29, 2024 on LessWrong.

TLDR: Select candidates heterogeneously, then give them all a very hard test, only continue with candidates that do very well (accept that you lose some good ones), and only then judge on interviews/whatever.

I'm no expert but I've made some recommendations that turned out pretty well -- maybe like 5 ever. This post would probably be better if I waited 10 years to write it. Nonetheless, I think my method is far better than what most orgs/corps do. If you have had mad hiring success (judging by what your org accomplished) then please comment! Half-remembered versions of Paul Graham's taste thing and Yudkowsky's Vinge's Law have led some folks to think that judging talent above your own is extremely difficult. I do not think so.

Prereqs:
- It's the kind of position where someone super good at it can generate a ton of value - eg sales/outreach, coding, actual engineering, research, management, ops, ...
- Lots of candidates are available and you expect at least some of them are super good at the job.
- You have at least a month to look.
- It's possible for someone to demonstrate extreme competence at this type of job in a day or two.
- Your org is trying to do a thing - rather than be a thing.
- You want to succeed at that thing - ie you don't have some other secret goal.
- Your goal with hiring people is to do that thing better/faster - ie you don't need more friends or a prestige bump.
- Your work situation does not demand that you look stand-out competent - ie you don't unemploy yourself if you succeed in hiring well.

You probably don't meet the prereqs. You are probably in it for the journey more than the destination; your life doesn't improve if org goals are achieved; your raises depend on you not out-hiring yourself; etc. Don't feel bad - it is totally ok to be an ordinary social creature! Being a goal psycho often sucks in every way except all the accomplished goals. If you do meet the prereqs, then good news, hiring is almost easy. You just need to find people who are good at doing exactly what you need done.

Here's the method:
- Do look at performance (measure it yourself)
- Accept noise
- Don't look at anything else (yet)
- Except that they work hard

Do look at performance

Measure it yourself. Make up a test task. You need something that people can take without quitting their jobs or much feedback from you; you and the candidate should not become friends during the test; a timed 8-hour task is a reasonable starting point. Most importantly, you must be able to quickly and easily distinguish good results from very good results. The harder the task, the easier it is to judge the success of top attempts. If you yourself cannot complete the task at all, then congratulations, you now have a method to judge talent far above your own. Take that, folk Vinge's law.

Important! Make the task something where success really does tell you they'll do the job well. Not a proxy IQ test or leetcode. The correlation is simply not high enough. Many people think they just need to hire someone generally smart and capable. I disagree, unless your org is very large or nebulous. This task must also not be incredibly lame or humiliating, or you will only end up hiring people lacking a spine. (Common problem.) Don't filter out the spines.
It can be hard to think of a good test task but it is well worth all the signal you will get. Say you are hiring someone to arrange all your offices. Have applicants come arrange a couple offices and see if people like it. Pretty simple. Say you are hiring someone to build a house. Have contractors build a shed in one day. Ten sheds only cost like 5% of what a house costs, but bad builders will double your costs and timeline. Pay people as much as you can for their time and the...
Aug 28, 2024 • 2min

LW - "Deception Genre" What Books are like Project Lawful? by Double

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: "Deception Genre" What Books are like Project Lawful?, published by Double on August 28, 2024 on LessWrong.

This post is spoiler-free.

I just finished Project Lawful, a really long, really weird book by Eliezer Yudkowsky. The book's protagonist is a knowledgeable and perceptive target. A conspiracy forms around the target to learn from him while keeping him from finding out that helping them is not in the target's best interests. The book is written from the perspective of both the target and the conspiracists. The target notices inconsistencies and performs experiments to test his false reality while also acting in the fabricated reality according to his interests. The conspiracists frantically try to keep the target from catching them or building enough evidence against them that he concludes they have been lying.

This is a description of (part of) the plot of Project Lawful. But this could be the description of an entire genre! If the genre doesn't already have a name, it could be the "Deception Genre." Another work in this category would be The Truman Show, which fits the deception and the target's escape within a <2hr movie runtime.

Other stories with lying don't really have the same structure. Walter White in Breaking Bad is trying to keep his crimes hidden but isn't constructing a false reality around the cops or his family. Death Note comes close, though Light tries to mislead L about specifically who Kira is and how the Death Note works rather than constructing an entire false reality around L. Many stories about dystopias have the protagonists discover that their realities are false, but fewer of those feature the perspectives of the conspiracists frantically trying to keep the deception running.

Do you know any other stories in the Deception Genre? Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Aug 28, 2024 • 4min

LW - things that confuse me about the current AI market. by DMMF

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: things that confuse me about the current AI market., published by DMMF on August 28, 2024 on LessWrong.

Paging Gwern or anyone else who can shed light on the current state of the AI market - I have several questions.

Since the release of ChatGPT, at least 17 companies, according to the LMSYS Chatbot Arena Leaderboard, have developed AI models that outperform it. These companies include Anthropic, NexusFlow, Microsoft, Mistral, Alibaba, Hugging Face, Google, Reka AI, Cohere, Meta, 01 AI, AI21 Labs, Zhipu AI, Nvidia, DeepSeek, and xAI. Since GPT-4's launch, 15 different companies have reportedly created AI models that are smarter than GPT-4. Among them are Reka AI, Meta, AI21 Labs, DeepSeek AI, Anthropic, Alibaba, Zhipu, Google, Cohere, Nvidia, 01 AI, NexusFlow, Mistral, and xAI.

Twitter AI (xAI), which seemingly had no prior history of strong AI engineering, with a small team and limited resources, has somehow built the third smartest AI in the world, apparently on par with the very best from OpenAI. The top AI image generator, Flux AI, which is considered superior to the offerings from OpenAI and Google, has no Wikipedia page, barely any information available online, and seemingly almost no employees. The next best in class, Midjourney and Stable Diffusion, also operate with surprisingly small teams and limited resources.

I have to admit, I find this all quite confusing. I expected companies with significant experience and investment in AI to be miles ahead of the competition. I also assumed that any new competitors would be well-funded and dedicated to catching up with the established leaders. Understanding these dynamics seems important because they influence the merits of things like a potential pause in AI development or the ability of China to outcompete the USA in AI. Moreover, as someone with general market interests, the valuations of some of these companies seem potentially quite off.

So here are my questions:
1. Are the historically leading AI organizations - OpenAI, Anthropic, and Google - holding back their best models, making it appear as though there's more parity in the market than there actually is?
2. Is this apparent parity due to a mass exodus of employees from OpenAI, Anthropic, and Google to other companies, resulting in the diffusion of "secret sauce" ideas across the industry?
3. Does this parity exist because other companies are simply piggybacking on Meta's open-source AI model, which was made possible by Meta's massive compute resources? Now, by fine-tuning this model, can other companies quickly create models comparable to the best?
4. Is it plausible that once LLMs were validated and the core idea spread, it became surprisingly simple to build, allowing any company to quickly reach the frontier?
5. Are AI image generators just really simple to develop but lack substantial economic reward, leading large companies to invest minimal resources into them?
6. Could it be that legal challenges in building AI are so significant that big companies are hesitant to fully invest, making it appear as if smaller companies are outperforming them?
7. And finally, why is OpenAI so valuable if it's apparently so easy for other companies to build comparable tech? Conversely, why aren't these no-name companies making leading LLMs valued higher?
Of course, the answer is likely a mix of the factors mentioned above, but it would be very helpful if someone could clearly explain the structures affecting the dynamics highlighted here. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Aug 28, 2024 • 21min

EA - Statistical foundations for worldview diversification by Karthik Tadepalli

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Statistical foundations for worldview diversification, published by Karthik Tadepalli on August 28, 2024 on The Effective Altruism Forum.

Note: this has been in my drafts for a long time, and I just decided to let it go without getting too hung up on details, so this is much rougher than it should be.

Summary: Worldview diversification seems hard to justify philosophically, because it results in lower expected value than going with a single worldview that has the highest EV. I show that you can justify worldview diversification as the solution to a decision problem under uncertainty. The first way is to interpret worldview diversification as a minimax strategy, in which you maximize the worst-case utility of your allocation. The second way is as an approximate solution to the problem of maximizing expected utility for a risk-averse decision maker.

Overview

Alexander Berger: ...the central idea of worldview diversification is that the internal logic of a lot of these causes might be really compelling and a little bit totalizing, and you might want to step back and say, "Okay, I'm not ready to go all in on that internal logic." So one example would be just comparing farm animal welfare to human causes within the remit of global health and wellbeing. One perspective on farm animal welfare would say, "Okay, we're going to get chickens out of cages. I'm not a speciesist and I think that a chicken-day suffering in the cage is somehow very similar to a human-day suffering in a cage, and I should care similarly about these things." I think another perspective would say, "I would trade an infinite number of chicken-days for any human experience. I don't care at all." If you just try to put probabilities on those views and multiply them together, you end up with this really chaotic process where you're likely to either be 100% focused on chickens or 0% focused on chickens. Our view is that that seems misguided. It does seem like animals could suffer. It seems like there's a lot at stake here morally, and that there's a lot of cost-effective opportunities that we have to improve the world this way. But we don't think that the correct answer is to either go 100% all in where we only work on farm animal welfare, or to say, "Well, I'm not ready to go all in, so I'm going to go to zero and not do anything on farm animal welfare." ...

Rob Wiblin: Yeah. It feels so intuitively clear that when you're to some degree picking these numbers out of a hat, you should never go 100% or 0% based on stuff that's basically just guesswork. I guess, the challenge here seems to have been trying to make that philosophically rigorous, and it does seem like coming up with a truly philosophically grounded justification for that has proved quite hard. But nonetheless, we've decided to go with something that's a bit more cluster thinking, a bit more embracing common sense and refusing to do something that obviously seems mad.

Alexander Berger: And I think part of the perspective is to say look, I just trust philosophy a little bit less. So the fact that something might not be philosophically rigorous... I'm just not ready to accept that as a devastating argument against it.

(80,000 Hours)

This note explains how you might arrive at worldview diversification from a formal framework.
I don't claim it is the only way you might arrive at it, and I don't claim that it captures everyone's intuitions for why worldview diversification is a good idea. It only captures my intuitions, and formalizes them in a way that might be helpful for others. Suppose a decisionmaker wants to allocate money across different cause areas. But the marginal social value of money to each cause area is unknown/known with error (e.g. moral weights, future forecasts), so they don't actually know how to maximize social value ex ante. What sh...
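To illustrate the minimax framing from the summary above, here is a small numerical sketch. The numbers are my own toy assumptions, not from the post: two worldviews disagree completely about which of two cause areas has any value, and the expected-value maximizer goes all in while the minimax rule diversifies.

```python
# Toy comparison of EV-maximizing vs. minimax allocation across two cause areas.
# All numbers are illustrative assumptions, not taken from the post.
import numpy as np

# value_per_dollar[worldview][cause]: each worldview thinks only one cause matters.
value_per_dollar = np.array([
    [1.0, 0.0],  # worldview A: only cause 0 has value
    [0.0, 1.0],  # worldview B: only cause 1 has value
])
credence = np.array([0.6, 0.4])  # credence in each worldview

shares = np.linspace(0, 1, 101)  # fraction of the budget sent to cause 1

def portfolio_value(share, weights):
    """Value of spending (1 - share) on cause 0 and share on cause 1."""
    return (1 - share) * weights[0] + share * weights[1]

expected_value = np.array([
    credence @ np.array([portfolio_value(s, w) for w in value_per_dollar])
    for s in shares
])
worst_case = np.array([
    min(portfolio_value(s, w) for w in value_per_dollar)
    for s in shares
])

# The EV maximizer goes all in on the favored worldview's cause;
# the minimax rule splits the budget to protect the worst case.
print("EV-maximizing share to cause 1:", shares[expected_value.argmax()])  # 0.0
print("Minimax share to cause 1:      ", shares[worst_case.argmax()])      # 0.5
```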
Aug 28, 2024 • 4min

EA - Legal Impact for Chickens is Hiring an Attorney by KathrynLIC

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Legal Impact for Chickens is Hiring an Attorney, published by KathrynLIC on August 28, 2024 on The Effective Altruism Forum.

Legal Impact for Chickens is hiring a Staff Attorney or Managing Attorney. We are prioritizing applications submitted by October 7.

About us: Legal Impact for Chickens (LIC) is a 501(c)(3) litigation nonprofit. We work to protect farmed animals. You may have seen our Costco shareholder derivative suit in The Washington Post, Fox Business, or CNN Business - or even on TikTok. Or perhaps you saw LIC recommended by Animal Charity Evaluators. Now, we're looking for our next hire - an entrepreneurial litigator to help fight for animals!

About you:
• 2+ years of litigation experience (for staff attorney)
• 6+ years of litigation experience (for managing attorney)
• Licensed and in good standing with the state bar where you live
• Excellent analytical, writing, and verbal-communication skills
• Zealous, creative, enthusiastic litigator
• Passion for helping farmed animals
• Interest in entering a startup nonprofit on the ground floor, and helping to build something
• Willing to do all types of nonprofit startup work, beyond just litigation
• Strong work ethic and initiative
• Kind to our fellow humans, and excited about creating a welcoming, inclusive team
• Experience supervising staff, interns, contractors, or volunteers (for managing attorney)

We encourage candidates with most of the above to apply; we do not expect all candidates to fit this job description 100%.

About the role: You will be an integral part of LIC. You'll help shape our organization's future. Your role will be a combination of (1) designing and pursuing creative impact litigation for animals, and (2) helping with everything else we need to do, to run this new nonprofit! Since this is such a small organization, you'll wear many hats: Sometimes you may wear a law-firm partner's hat, making litigation strategy decisions or covering a hearing on your own. Sometimes you'll wear an associate's hat, analyzing complex and novel legal issues. Sometimes you'll pitch in on administrative tasks, making sure a brief gets filed properly or formatting a table of authorities. Sometimes you'll wear a start-up founder's hat, helping plan the number of employees we need, or representing LIC at conferences. We can only promise it won't be dull! This job offers tremendous opportunity for advancement, in the form of helping to lead LIC as we grow. The hope is for you to become an indispensable, long-time member of our new team.

Commitment: Full time
Location and travel: This is a remote, U.S.-based position. You must be available to travel for work as needed, since we will litigate all over the country.
Reports to: Alene Anello, LIC's president
Salary: $80,000-$130,000 depending on experience and role. (E.g. from $80,000 for someone with two years of litigation experience, up to $130,000 for someone with 15 years or more of litigation experience.)

One more thing! LIC is an equal opportunity employer. Women and people of color are strongly encouraged to apply. Applicants will receive consideration for employment without regard to race, religion, gender, sexual orientation, national origin, disability, age, or veteran status.

To Apply: To apply, please fill out this form by October 7, 2024.
If the link doesn't work, please copy-and-paste this into your browser: https://forms.monday.com/forms/d0bd6cda313e3aac650fd92b86697f61?r=use1 Thank you for your time and your compassion! Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Aug 28, 2024 • 3min

LW - Unit economics of LLM APIs by dschwarz

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Unit economics of LLM APIs, published by dschwarz on August 28, 2024 on LessWrong.

Disclaimer 1: Our calculations are rough in places; information is sparse, guesstimates abound.
Disclaimer 2: This post draws from public info on FutureSearch as well as a paywalled report. If you want the paywalled numbers, email dan@futuresearch.ai with your LW account name and we'll send you the report for free.

Here's our view of the unit economics of OpenAI's API. Note: this considers GPT-4-class models only, not audio or image APIs, and only direct API traffic, not usage in ChatGPT products.

As of June 2024, OpenAI's API was very likely profitable, with surprisingly high margins. Our median estimate for gross margin (not including model training costs or employee salaries) was 75%. Once all traffic switches over to the new August GPT-4o model and pricing, OpenAI plausibly still will have a healthy profit margin. Our median estimate for the profit margin is 55%. The Information implied that OpenAI rents ~60k A100-equivalents from Microsoft for non-ChatGPT inference. If this is true, OpenAI is massively overprovisioned for the API, even when we account for the need to rent many extra GPUs to account for traffic spikes and future growth (arguably creating something of a mystery). We provide an explicit, simplified first-principles calculation of inference costs for the original GPT-4, and find significantly lower throughput & higher costs than Benjamin Todd's result (which drew from Semianalysis).

Summary chart:

What does this imply? With any numbers, we see two major scenarios:

Scenario one: competition intensifies. With llama, Gemini, and Claude all comparable and cheap, OpenAI will be forced to again drop their prices in half. (With their margins FutureSearch calculates, they can do this without running at a loss.) LLM APIs become like cloud computing: huge revenue, but not very profitable.

Scenario two: one LLM pulls away in quality. GPT-5 and Claude-3.5-opus might come out soon at huge quality improvements. If only one LLM is good enough for important workflows (like agents), it may be able to sustain a high price and huge margins. Profits will flow to this one winner.

Our numbers update us, in either scenario, towards:
- An increased likelihood of more significant price drops for GPT-4-class models.
- A (weak) update that frontier labs are facing less pressure today to race to more capable models. If you thought that GPT-4o (and Claude, Gemini, and hosted versions of llama-405b) were already running at cost in the API, or even at a loss, you would predict that the providers are strongly motivated to release new models to find profit. If our numbers are approximately correct, these businesses may instead feel there is plenty of margin left, and profit to be had, even if GPT-5 and Claude-3.5-opus etc. do not come out for many months.

More info at https://futuresearch.ai/openai-api-profit. Feedback welcome and appreciated - we'll update our estimates accordingly. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
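As a rough illustration of the kind of unit-economics arithmetic the post describes, here is a back-of-the-envelope gross-margin calculation. Every number below is an assumption of mine chosen for illustration, not one of FutureSearch's estimates.

```python
# Back-of-the-envelope API gross margin per GPU-hour.
# All inputs are illustrative assumptions, not FutureSearch's figures.

price_per_million_tokens = 10.00      # USD, assumed blended input/output price
tokens_served_per_gpu_hour = 500_000  # assumed effective serving throughput
gpu_hour_cost = 2.50                  # USD, assumed rented A100-equivalent cost

revenue_per_gpu_hour = (tokens_served_per_gpu_hour / 1_000_000) * price_per_million_tokens
gross_margin = (revenue_per_gpu_hour - gpu_hour_cost) / revenue_per_gpu_hour

print(f"revenue per GPU-hour: ${revenue_per_gpu_hour:.2f}")  # $5.00
print(f"cost per GPU-hour:    ${gpu_hour_cost:.2f}")         # $2.50
print(f"gross margin:         {gross_margin:.0%}")           # 50%
```

Plugging in different throughput and price assumptions is what moves the margin toward or away from the post's 75% and 55% medians.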
Aug 28, 2024 • 3min

LW - In defense of technological unemployment as the main AI concern by tailcalled

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: In defense of technological unemployment as the main AI concern, published by tailcalled on August 28, 2024 on LessWrong. It seems to me that when normal people are concerned about AI destroying their life, they are mostly worried about technological unemployment, whereas rationalists think that it is a bigger risk that the AI might murder us all, and that automation gives humans more wealth and free time and is therefore good. I'm not entirely unsympathetic to the rationalist position here. If we had a plan for how to use AI to create a utopia where humanity could thrive, I'd be all for it. We have problems (like death) that we are quite far from solving, and which it seems like a superintelligence could in principle quickly solve. But this requires value alignment: we need to be quite careful what we mean by concepts like "humanity", "thrive", etc., so the AI can explicitly maintain good conditions. What kinds of humans do we want, and what kinds of thriving should they have? This needs to be explicitly planned by any agent which solves this task. Our current society doesn't say "humans should thrive", it says "professional humans should thrive"; certain alternative types of humans like thieves are explicitly suppressed, and other types of humans like beggars are not exactly encouraged. This is of course not an accident: professionals produce value, which is what allows society to exist in the first place. But with technological unemployment, we decouple professional humans from value production, undermining the current society's priority of human welfare. This loss is what causes existential risk. If humanity was indefinitely competitive in most tasks, the AIs would want to trade with us or enslave us instead of murdering us or letting us starve to death. Even if we manage to figure out how to value-align AIs, this loss leads to major questions about what to value-align the AIs to, since e.g. if we value human capabilities, the fact that those capabilities become uncompetitive likely means that they will diminish to the point of being vestigial. It's unclear how to solve this problem. Eliezer's original suggestion was to keep humans more capable than AIs by increasing the capabilities of humans. Yet even increasing the capabilities of humanity is difficult, let alone keeping up with technological development. Robin Hanson suggests that humanity should just sit back and live off our wealth as we got replaced. I guess that's the path we're currently on, but it is really dubious to me whether we'll be able to keep that wealth, and whether the society that replaces us will have any moral worth. Either way, these questions are nearly impossible to separate from the question of, what kinds of production will be performed in the future? Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Aug 28, 2024 • 13min

LW - Am I confused about the "malign universal prior" argument? by nostalgebraist

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Am I confused about the "malign universal prior" argument?, published by nostalgebraist on August 28, 2024 on LessWrong.

In a 2016 blog post, Paul Christiano argued that the universal prior (hereafter "UP") may be "malign." His argument has received a lot of follow-up discussion, e.g. in Mark Xu's "The Solomonoff Prior is Malign" and Charlie Steiner's "The Solomonoff prior is malign. It's not a big deal.", among other posts.

This argument never made sense to me. The reason it doesn't make sense to me is pretty simple, but I haven't seen it mentioned explicitly in any of the ensuing discussion. This leaves me feeling like either I am misunderstanding the argument in a pretty fundamental way, or that there is a problem with the argument that has gotten little attention from the argument's critics (in which case I don't understand why). I would like to know which of these is the case, and correct my misunderstanding if it exists, hence this post. (Note: In 2018 I wrote a comment on the original post where I tried to state one of my objections to the argument, though I don't feel I expressed myself especially well there.)

UP-using "universes" and simulatable "universes"

The argument for malignity involves reasoning beings, instantiated in Turing machines (TMs), which try to influence the content of the UP in order to affect other beings who are making decisions using the UP. Famously, the UP is uncomputable. This means the TMs (and reasoning beings inside the TMs) will not be able to use[1] the UP themselves, or simulate anyone else using the UP. At least not if we take "using the UP" in a strict and literal sense. Thus, I am unsure how to interpret claims (which are common in presentations of the argument) about TMs "searching for universes where the UP is used" or the like.

For example, from Mark Xu's "The Solomonoff Prior is Malign": In particular, this suggests a good strategy for consequentialists: find a universe that is using a version of the Solomonoff prior that has a very short description of the particular universe the consequentialists find themselves in.

Or, from Christiano's original post: So the first step is getting our foot in the door - having control over the parts of the universal prior that are being used to make important decisions. This means looking across the universes we care about, and searching for spots within those universe where someone is using the universal prior to make important decisions. In particular, we want to find places where someone is using a version of the universal prior that puts a lot of mass on the particular universe that we are living in, because those are the places where we have the most leverage. Then the strategy is to implement a distribution over all of those spots, weighted by something like their importance to us (times the fraction of mass they give to the particular universe we are in and the particular channel we are using). That is, we pick one of those spots at random and then read off our subjective distribution over the sequence of bits that will be observed at that spot (which is likely to involve running actual simulations).

What exactly are these "universes" that are being searched over? We have two options:
1. They are not computable universes. They permit hypercomputation that can leverage the "actual" UP, in its full uncomputable glory, without approximation.
2. They are computable universes. Thus the UP cannot be used in them. But maybe there is some computable thing that resembles or approximates the UP, and gets used in these universes.

Option 1 seems hard to square with the talk about TMs "searching for" universes or "simulating" universes. A TM can't do such things to the universes of option 1. Hence, the argument is presumably about option 2. That is, although we are trying to reason about the content of...
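For reference, the universal prior discussed throughout is the standard Solomonoff semimeasure over bit strings; a textbook form of the definition (not spelled out in the excerpt above) is:

```latex
% Universal (Solomonoff) prior of a finite bit string x.
% U is a fixed universal prefix Turing machine; the sum runs over all
% programs p whose output on U begins with x. Because halting is
% undecidable, M is uncomputable, which is the property the post leans on.
M(x) \;=\; \sum_{p \,:\, U(p)\ \text{begins with}\ x} 2^{-|p|}
```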
