The Nonlinear Library: LessWrong

The Nonlinear Fund
Jun 28, 2024 • 17min

LW - Corrigibility = Tool-ness? by johnswentworth

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Corrigibility = Tool-ness?, published by johnswentworth on June 28, 2024 on LessWrong.

Goal of This Post

I have never seen anyone give a satisfying intuitive explanation of what corrigibility (in roughly Eliezer's sense of the word) is. There are lists of desiderata, but they sound like scattered wishlists which don't obviously point to a unified underlying concept at all. There's also Eliezer's extremely meta pointer:

We can imagine, e.g., the AI imagining itself building a sub-AI while being prone to various sorts of errors, asking how it (the AI) would want the sub-AI to behave in those cases, and learning heuristics that would generalize well to how we would want the AI to behave if it suddenly gained a lot of capability or was considering deceiving its programmers and so on.

… and that's basically it.[1] In this post, we're going to explain a reasonably-unified concept which seems like a decent match to "corrigibility" in Eliezer's sense.

Tools

Starting point: we think of a thing as corrigible exactly insofar as it is usefully thought-of as a tool. A screwdriver, for instance, is an excellent central example of a corrigible object. For AI alignment purposes, the challenge is to achieve corrigibility - i.e. tool-ness - in much more general, capable, and intelligent systems. … that all probably sounds like a rather nebulous and dubious claim, at this point. In order for it to make sense, we need to think through some key properties of "good tools", and also how various properties of incorrigibility make something a "bad tool". We broke off a separate post on what makes something usefully thought-of as a tool. Key ideas:

- Humans tend to solve problems by finding partial plans with "gaps" in them, where the "gaps" are subproblems which the human will figure out later. For instance, I might make a plan to decorate my apartment with some paintings, but leave a "gap" about how exactly to attach the paintings to the wall; I can sort that out later.[2]
- Sometimes many similar subproblems show up in my plans, forming a cluster.[3] For instance, there's a cluster (and many subclusters) of subproblems which involve attaching things together.
- Sometimes a thing (a physical object, a technique, whatever) makes it easy to solve a whole cluster of subproblems. That's what tools are. For instance, a screwdriver makes it easy to solve a whole subcluster of attaching-things-together subproblems.

How does that add up to corrigibility?

Respecting Modularity

One key piece of the above picture is that the gaps/subproblems in humans' plans are typically modular - i.e. we expect to be able to solve each subproblem without significantly changing the "outer" partial plan, and without a lot of coupling between different subproblems. That's what makes the partial plan with all its subproblems useful in the first place: it factors the problem into loosely-coupled subproblems. Claim from the tools post: part of what it means for a tool to solve a subproblem-cluster is that the tool roughly preserves the modularity of that subproblem-cluster. That means the tool should not have a bunch of side effects which might mess with other subproblems, or mess up the outer partial plan.
Furthermore, the tool needs to work for a whole subproblem-cluster, and that cluster includes similar subproblems which came up in the context of many different problems. So, the tool needs to robustly not have side effects which mess up the rest of the plan, across a wide range of possibilities for what "the rest of the plan" might be. Concretely: a screwdriver which sprays flames out the back when turned is a bad tool; it usually can't be used to solve most screw-turning subproblems when the bigger plan takes place in a wooden building. Another bad tool: a screwdriver which, when turned, also turns the lights on and off, cau...
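As a toy illustration of the "robustly no side effects across a wide range of plans" criterion above, here is a minimal Python sketch. It is not from the post; every name in it is made up. It models a tool as a function on a world state and checks that, across many randomly sampled "rest of the plan" contexts, the tool only changes the variables belonging to its own subproblem.

```python
# Toy sketch (not from the post): a "tool" acts on a world state, and we check
# that it respects modularity, i.e. it only touches the variables belonging to
# its own subproblem, across many different "rest of the plan" contexts.
import random

def screwdriver(state):
    """A well-behaved tool: only affects the screw it is applied to."""
    new_state = dict(state)
    new_state["screw_tightened"] = True
    return new_state

def flame_screwdriver(state):
    """A bad tool: tightens the screw but also sets the building on fire."""
    new_state = dict(state)
    new_state["screw_tightened"] = True
    new_state["building_on_fire"] = True
    return new_state

def respects_modularity(tool, owned_keys, n_contexts=1000, seed=0):
    """Check that `tool` changes only the variables it 'owns', for many
    randomly sampled outer-plan contexts."""
    rng = random.Random(seed)
    for _ in range(n_contexts):
        # A random "rest of the plan": unrelated variables the tool shouldn't touch.
        context = {
            "building_material": rng.choice(["wood", "steel"]),
            "lights_on": rng.choice([True, False]),
            "building_on_fire": False,
            "screw_tightened": False,
        }
        result = tool(context)
        changed = {k for k in context if result.get(k) != context[k]}
        if not changed <= owned_keys:
            return False  # side effect outside the tool's own subproblem
    return True

print(respects_modularity(screwdriver, {"screw_tightened"}))        # True
print(respects_modularity(flame_screwdriver, {"screw_tightened"}))  # False
```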
Jun 28, 2024 • 3min

LW - Secondary forces of debt by KatjaGrace

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Secondary forces of debt, published by KatjaGrace on June 28, 2024 on LessWrong.

A general thing I hadn't noticed about debts until lately: Whenever Bob owes Alice, then Alice has reason to look after Bob, to the extent that increases the chance he satisfies the debt. Yet at the same time, Bob has an incentive for Alice to disappear, insofar as it would relieve him. These might be tiny incentives, and not overwhelm, for instance, Bob's many reasons for not wanting Alice to disappear. But the bigger the owing, the more relevant the incentives.

When big enough, the former comes up as entities being "too big to fail", and potentially rescued from destruction by those who would like them to repay or provide something expected of them in future. But the opposite must exist also: too big to succeed - where the abundance owed to you is so off-putting to provide that those responsible for it would rather disempower you. And if both kinds of incentive are around in wisps whenever there is a debt, surely they often get big enough to matter, even before they become the main game. For instance, if everyone around owes you a bit of money, I doubt anyone will murder you over it. But I wouldn't be surprised if it motivated a bit more political disempowerment for you on the margin.

There is a lot of owing that doesn't arise from formal debt, where these things also apply. If we both agree that I - as your friend - am obliged to help you get to the airport, you may hope that I have energy and fuel and am in a good mood. Whereas I may (regretfully) be relieved when your flight is canceled. Money is an IOU from society for some stuff later, so having money is another kind of being owed. Perhaps this is part of the common resentment of wealth.

I tentatively take this as reason to avoid debt in all its forms more: it's not clear that the incentives of alliance in one direction make up for the trouble of the incentives for enmity in the other. And especially so when they are considered together - if you are going to become more aligned with someone, better it be someone who is not simultaneously becoming misaligned with you. Even if such incentives never change your behavior, every person you are obligated to help for an hour on their project is a person for whom you might feel a dash of relief if their project falls apart. And that is not fun to have sitting around in relationships.

(Inspired by reading The Debtor's Revolt by Ben Hoffman lately, which may explicitly say this, but it's hard to be sure because I didn't follow it very well. Also perhaps inspired by a recent murder mystery spree, in which my intuitions have absorbed the heuristic that having something owed to you is a solid way to get murdered.)

Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Jun 28, 2024 • 1h 12min

LW - AI #70: A Beautiful Sonnet by Zvi

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI #70: A Beautiful Sonnet, published by Zvi on June 28, 2024 on LessWrong.

They said it couldn't be done. No, not Claude Sonnet 3.5 becoming the clear best model. No, not the Claude-Sonnet-empowered automatic meme generators. Those were whipped together in five minutes. They said I would never get quiet time and catch up. Well, I showed them! That's right. Yes, there is a new best model, but otherwise it was a quiet week. I got a chance to incorporate the remaining biggest backlog topics. The RAND report is covered under Thirty Eight Ways to Steal Your Model Weights. Last month's conference in Seoul is covered in You've Got Seoul. I got to publish my thoughts on OpenAI's Model Spec last Friday.

Table of Contents

Be sure to read about Claude 3.5 Sonnet here. That is by far the biggest story.
1. Introduction.
2. Table of Contents.
3. Language Models Offer Mundane Utility. I am increasingly persuaded.
4. Language Models Don't Offer Mundane Utility. EU's DMA versus the AiPhone.
5. Clauding Along. More people, mostly impressed.
6. Fun With Image Generation. They are coming for our memes. Then Hollywood.
7. Copyright Confrontation. The RIAA does the most RIAA thing.
8. Deepfaketown and Botpocalypse Soon. Character.ai addiction. Am I out of touch?
9. They Took Our Jobs. More arguments that the issues lie in the future.
10. The Art of the Jailbreak. We need to work together as a team.
11. Get Involved. AISI, Apollo, Astra, Accra, BlueDot, Cybersecurity and DOE.
12. Introducing. Forecasting, OpenAI Mac App, Otto, Dot, Butterflies, Decagon.
13. In Other AI News. OpenAI equity takes steps forward. You can sell it.
14. Quiet Speculations. A distinct lack of mojo.
15. You've Got Seoul. Delayed coverage of the Seoul summit from last month.
16. Thirty Eight Ways to Steal Your Model Weights. Right now they would all work.
17. The Quest for Sane Regulations. Steelmanning restraint.
18. SB 1047. In Brief.
19. The Week in Audio. Dwarkesh interviews Tony Blair, and many more.
20. Rhetorical Innovation. A demolition, and also a disputed correction.
21. People Are Worried About AI Killing Everyone. Don't give up. Invest wisely.
22. Other People Are Not As Worried About AI Killing Everyone. What even is ASI?
23. The Lighter Side. Eventually the AI will learn.

Language Models Offer Mundane Utility

Training only on (x,y) pairs, define the function f(x), compose and invert it without in-context examples or chain of thought.

AI Dungeon will let you be the DM and take the role of the party, if you prefer.

Lindy 'went rogue' and closed a customer on its own. They seem cool with it?

Persuasive capability of the model is proportional to the log of the model size, says paper. Author Kobi Hackenburg paints this as reassuring, but the baseline is that everything scales with the log of the model size. He says this is mostly based on 'task completion' and staying on topic improving, and current frontier models are already near perfect at that, so he is skeptical we will see further improvement. I am not. I do believe the result that none of the models was 'more persuasive than human baseline' in the test, but that is based on uncustomized messages on generic political topics. Of course we should not expect above human performance there for current models.

75% of knowledge workers are using AI, but 78% of the 75% are not telling the boss.
Build a team of AI employees to write the first half of your Shopify CEO speech from within a virtual office, then spend the second half of the speech explaining how you built the team. It is so weird to think 'the best way to get results from AI employees I can come up with is to make them virtually thirsty so they will have spontaneous water cooler conversations.' That is the definition of scratching the (virtual) surface. Do a bunch of agent-based analysis off a si...
Jun 27, 2024 • 9min

LW - Schelling points in the AGI policy space by mesaoptimizer

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Schelling points in the AGI policy space, published by mesaoptimizer on June 27, 2024 on LessWrong.

I've been thinking about memetically fit Schelling points in the AGI policy space. I'll describe four such "Schelling policies", and use them as pedagogical examples.

Shut it all down

MIRI's new stated objective is the clearest example of a Schelling policy: "Shut it all down". MIRI states that they want governments to coordinate to pause all AI research that involves smarter-than-human systems. Laypeople will find this policy easy to understand, since they can rely on the shared cultural knowledge of CFC bans and international nuclear disarmament as case studies. If you want to coordinate a large number of people coherently towards furthering a particular policy, "you get about five words" that you can make 'common knowledge' such that people can coordinate in a specific direction. The ease of communicating the policy makes a big difference in such conditions.

When you attempt to communicate an idea widely, you'll notice that people usually end up with multiple slightly (or sometimes wildly) differing copies of the original idea. If you've played the Telephone game, you've experienced just how much information can be lost as an idea spreads from one person to another. In the context of policies, individual people's beliefs and incentives will warp the instantiation of the policy they will communicate and support. (For example, you'll find companies lobbying regulators to carve out exceptions that benefit them.) Here's where Schelling points are invaluable: they serve as natural attractors in the space of ideas, and therefore enable people to 'error-correct' the idea they encounter and figure out the policy that everyone is coordinating around.

"Shut it all down" is a Schelling point. "Shut it all down if we see evidence of unprompted deception and power-seeking in AGI models" is not a Schelling point: you have multiple free variables that can and will be optimized to benefit the people spreading the idea -- which can result in a lack of coordination and the idea being outcompeted by memetically fitter ideas. "Prevent the training of models using compute greater than 10^25 floating point operations" also has a free variable: why exactly 10^25 floating point operations? Why not 10^24 or 10^26? Until 10^25 floating point operations becomes a Schelling number, the policy containing it is not a Schelling point.

Effective Accelerationism (e/acc)

The biggest difference between e/acc and the PauseAI memeplexes is that e/acc doesn't seem to have a coherent set of goals and beliefs. Here are a bunch of memes that e/acc people tend to espouse:

- "It's time to build." (also the last line of The Techno-Optimist Manifesto)
- "Come and take it." (where "it" refers to GPUs here)
- "Accelerate or die."

At a first glance, one might say that e/acc isn't a Schelling policy -- it seems less like a coherent policy, and more like a set of 'vibes', verbal and non-verbal statements designed to create a desired emotional impact, regardless of the actual content. I disagree. A policy (or well, a memeplex) does not need to have an explicitly coherent set of beliefs and goals for it to result in coordinating people towards particular consequences.
You might expect this to reduce the spread rate of this particular policy, but e/acc specifically compensates for it by being significantly more fun and socially, financially, and professionally profitable to coordinate around. For example, venture capital firms such as a16z want the opportunity to make a lot of money from the gold rush that is the race to AGI, and a lot of software developers want a shot at making billions of dollars if their startup succeeds. The possibility of regulations would cause the music to stop, and they don't want that. In fact, you don...
Jun 27, 2024 • 20min

LW - Live Theory Part 0: Taking Intelligence Seriously by Sahil

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Live Theory Part 0: Taking Intelligence Seriously, published by Sahil on June 27, 2024 on LessWrong.

Acknowledgements

The vision here was midwifed originally in the wild and gentle radiance that is Abram's company (though essentially none of the content is explicitly his). The PIBBSS-spirit has been infused in this work from before it began (may it infuse us all), as have meetings with the Agent Foundations team at MIRI over the past ~2 years. More recently, everyone who has been loving the High Actuation project into form (very often spontaneously and without being encumbered by self-consciousness of this fact):[1] individuals include Steve Petersen, Mateusz Baginski, Aditya Prasad, Harmony, TJ, Chris Lakin; the AISC 2024 team, Murray Buchanan, Matt Farr, Arpan Agrawal, Adam, Ryan, Quinn; various people from Topos, ALIFE, MAPLE, EA Bangalore. Published while at CEEALAR.

Disclaimers

Very occasionally there are small remarks/questions from a remarkable human named Steve, since this and the next two posts are an edited transcript of me giving him a talk. I left them in to retain the conversational tone. Steve has also consistently been a fantastic ground for this channeling. I use the term "artefact" a fair amount in this sequence. Unfortunately for you and me, Anthropic also recently started using "artifact" in a different way. I'm using "artefact" in the common sense of the word. The British spelling should help remind of the distinction.

Taking Intelligence Seriously

Sahil: I gave a talk recently, at an EA event just two days ago, where I made some quick slides (on the day of the talk, so not nearly as tidy as I'd like) and attempted to walk through this so-called "live theory". (Alternative terms include "adaptive theory" or "fluid theory"; where the theories themselves are imbued with some intelligence.) Maybe I can give you that talk. I'm not sure how much of what I was saying there will be present now, but I can try. What do you think? I think it'll take about 15 minutes. Yeah?

Steve: Cool.

Sahil: Okay, let me give you a version of this talk that's very abbreviated. So, the title I'm sure already makes sense to you, Steve. I don't know if this is something that you know, but I prefer the word "adaptivity" over intelligence. I'm fine with using "intelligence" for this talk, but really, when I'm thinking of AI and LLMs and "live" (as you'll see later), I'm thinking, in part, of adaptive. And I think that connotes much more of the relevant phenomena, and much less controversially. It's also less distractingly "foundational", in the sense of endless questions on "what intelligence means".

Failing to Take Intelligence Seriously

Right. So, I want to say there are two ways to fail to take intelligence, or adaptivity, seriously. One is, you know, the classic case, of people ignoring existential risk from artificial intelligence. The old "well, it's just a computer, just software. What's the big deal? We can turn it off." We all know the story there. In many ways, this particular failure-of-imagination is much less pronounced today. But, I say, a dual failure-of-imagination is true today even among the "cognoscenti", where we ignore intelligence by ignoring opportunities from moderately capable mindlike entities at scale. I'll go over this sentence slower in the next slide.
For now: there are two ways to not meet reality. On the left of the slide is "nothing will change". The same "classic" case of "yeah, what's the big deal? It's just software." On the right, it's the total singularity, of extreme unknowable super-intelligence. In fact, the phrase "technological singularity", IIRC, was coined by Vernor Vinge to mark the point that we can't predict beyond. So, it's also a way to be mind-killed. Even with whatever in-the-limit proxies we have for this, we make various sim...
Jun 26, 2024 • 2min

LW - Progress Conference 2024: Toward Abundant Futures by jasoncrawford

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Progress Conference 2024: Toward Abundant Futures, published by jasoncrawford on June 26, 2024 on LessWrong.

The progress movement has grown a lot in the last few years. We now have progress journals, think tanks, and fellowships. The progress idea has spread and evolved into the "abundance agenda", "techno-optimism", "supply-side progressivism", "American dynamism". All of us want to see more scientific, technological, and economic progress for the good of humanity, and envision a bold, ambitious, flourishing future. What we haven't had so far is a regular gathering of the community.

Announcing Progress Conference 2024, a two-day event to connect people in the progress movement. Meet great people, share ideas in deep conversations, catalyze new projects, get energized and inspired.

Hosted by: the Roots of Progress Institute, together with the Foresight Institute, HumanProgress.org, the Institute for Humane Studies, the Institute for Progress, and Works in Progress magazine
When: October 18-19, 2024
Where: Berkeley, CA - at the Lighthaven campus, an inviting space perfect for mingling
Speakers: Keynotes include Patrick Collison, Tyler Cowen, Jason Crawford, and Steven Pinker. Around 20 additional speakers will share ideas on four tracks: the big idea of human progress, policy for progress, tech for progress, and storytelling/media for progress. Full speaker list
Attendees: We expect 200+ intellectuals, builders, policy makers, storytellers, and students. This is an invitation-only event, but anyone can apply for an invitation. Complete the open application by July 15th.
Program: Two days of intellectual exploration, inspiration and interaction that will help shape the progress movement into a cultural force. Attend talks on topics from tech to policy to culture, build relationships with new people as you hang out on cozy sofas or enjoy the sun in the garden, sign up to run an unconference session and find others who share your interests and passions, or pitch your ideas to those who could help make your dreams a reality.

Special thanks to our early sponsors: Cato Institute, Astera Institute, and Freethink Media! We have more sponsorships open, view sponsorship opportunities here.

Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Jun 26, 2024 • 10min

LW - What is a Tool? by johnswentworth

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What is a Tool?, published by johnswentworth on June 26, 2024 on LessWrong.

Throughout this post, we're going to follow the Cognition -> Convergence -> Consequences methodology[1]. That means we'll tackle tool-ness in three main stages, each building on the previous:

- Cognition: What does it mean, cognitively, to view or model something as a tool?
- Convergence: Insofar as different minds (e.g. different humans) tend to convergently model the same things as tools, what are the "real patterns" in the environment which give rise to that convergence?
- Consequences: Having characterized the real patterns convergently recognized as tool-ness, what other properties or consequences of tool-ness can we derive? What further predictions does our characterization make?

We're not going to do any math in this post, though we will gesture at the spots where proofs or quantitative checks would ideally slot in.

Cognition: What does it mean, cognitively, to view or model something as a tool?

Let's start with a mental model of (the cognition of) problem solving, then we'll see how "tools" naturally fit into that mental model. When problem-solving, humans often come up with partial plans - i.e. plans which have "gaps" in them, which the human hasn't thought through how to solve, but expects to be tractable. For instance, if I'm planning a roadtrip from San Francisco to Las Vegas, a partial plan might look like "I'll take I-5 down the central valley, split off around Bakersfield through the Mojave, then get on the highway between LA and Vegas". That plan has a bunch of gaps in it: I'm not sure exactly what route I'll take out of San Francisco onto I-5 (including whether to go across or around the Bay), I don't know which specific exits to take in Bakersfield, I don't know where I'll stop for gas, I haven't decided whether I'll stop at the town museum in Boron, I might try to get pictures of the airplane storage or the solar thermal power plant, etc. But I expect those to be tractable problems which I can solve later, so it's totally fine for my plan to have such gaps in it.

How do tools fit into that sort of problem-solving cognition? Well, sometimes similar gaps show up in many different plans (or many times in one plan). And if those gaps are similar enough, then it might be possible to solve them all "in the same way". Sometimes we can even build a physical object which makes it easy to solve a whole cluster of similar gaps. Consider a screwdriver, for instance. There's a whole broad class of problems for which my partial plans involve unscrewing screws. Those partial plans involve a bunch of similar "unscrew the screw" gaps, for which I usually don't think in advance about how I'll unscrew the screw, because I expect it to be tractable to solve that subproblem when the time comes. A screwdriver is a tool for that class of gaps/subproblems[2].

So here's our rough cognitive characterization:

- Humans naturally solve problems using partial plans which contain "gaps", i.e. subproblems which we put off solving until later
- Sometimes there are clusters of similar gaps
- A tool makes some such cluster relatively easy to solve.

Convergence: Insofar as different minds (e.g. different humans) tend to convergently model the same things as tools, what are the "real patterns" in the environment which give rise to that convergence?
First things first: there are limits to how much different minds do, in fact, convergently model the same things as tools. You know that thing where there's some weird object or class of objects, and you're not sure what it is or what it's for, but then one day you see somebody using it for its intended purpose and you're like "oh, that's what it's for"? From this, we learn several things about tools: Insofar as different humans convergently model the same things as tools at...
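The plan/gap/cluster picture from the cognitive characterization above lends itself to a small toy formalization. The following Python sketch is illustrative only (it is not from the post, and every name in it is made up): a partial plan holds concrete steps plus gap-subproblems tagged with the cluster they belong to, and a tool is something that covers one of those clusters.

```python
# Toy formalization (not from the post): partial plans with "gaps", clusters of
# similar gaps, and tools that make a whole cluster of gaps easy to solve.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Gap:
    description: str   # the subproblem we've deferred, e.g. "attach painting to wall"
    cluster: str       # which cluster of similar subproblems it belongs to

@dataclass
class PartialPlan:
    steps: list[str]                               # parts of the plan already worked out
    gaps: list[Gap] = field(default_factory=list)  # subproblems left for later

@dataclass(frozen=True)
class Tool:
    name: str
    covers: frozenset[str]  # clusters of gaps this tool makes easy to solve

def unsolved_gaps(plan: PartialPlan, toolbox: list[Tool]) -> list[Gap]:
    """Gaps for which no tool in the toolbox covers the relevant cluster."""
    covered = {cluster for tool in toolbox for cluster in tool.covers}
    return [g for g in plan.gaps if g.cluster not in covered]

decorate = PartialPlan(
    steps=["buy paintings", "pick spots on the wall"],
    gaps=[Gap("attach painting to wall", cluster="attach-things-together"),
          Gap("hide the power cable", cluster="route-cables")],
)
toolbox = [Tool("screwdriver", covers=frozenset({"attach-things-together"}))]

print([g.description for g in unsolved_gaps(decorate, toolbox)])
# ['hide the power cable']
```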
Jun 25, 2024 • 12min

LW - Mistakes people make when thinking about units by Isaac King

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Mistakes people make when thinking about units, published by Isaac King on June 25, 2024 on LessWrong.

This is a linkpost for Parker Dimensional Analysis. Probably a little elementary for LessWrong, but I think it may still contain a few novel insights, particularly in the last section about Verizon's error.

A couple years ago, there was an interesting clip on MSNBC. A few weeks later, Matt Parker came out with a video analyzing why people tend to make mistakes like this. Now I'm normally a huge fan of Matt Parker. But in this case, I think he kinda dropped the ball. He does have a very good insight. He realizes that people are treating the "million" as a unit, removing it from the numbers before performing the calculation, then putting it back on. This is indeed the proximate cause of the error. But Matt goes on to claim that the mistake is the treating of "million" as a unit; the implication being that, as a number suffix or a multiplier or however you want to think of it, it's not a unit, and therefore cannot be treated like one. This is false.

So what is a unit, really? When we think of the term, we probably think of things like "meters", "degrees Celsius", "watts", etc.; sciency stuff. But I think the main reason we think of those is due to unit conversion; when you have to convert from meters to feet, or derive a force from mass and acceleration, this makes us very aware of the units being used, and we associate the concept of "unit" with this sort of physics conversion. In reality, a unit is just "what kind of thing you're counting". Matt uses two other examples in his video: "dollars" and "sheep". Both of these are perfectly valid units! If I say "50 meters", that's just applying the number "50" to the thing "meters", saying that you have 50 of that thing. "50 sheep" works exactly the same way.

So what about "millions"? Well, we can definitely count millions! 1 million, 2 million, etc. You could imagine making physical groupings of a million sheep at a time, perhaps using some very large rubber bands, and then counting up individual clusters. "Millions" is a unit![1]

So if millions is a perfectly valid unit, why do we get an incorrect result if we take it off and then put it back on again after the calculation? Well, because you can't do that with other units either! 100 watts divided by 20 watts does not equal 5 watts. It equals the number 5, with no unit. This is a somewhat subtle distinction, and easy to miss in a casual conversation. But it makes sense when you think about the actual things you're counting. 50 sheep is certainly not the same thing as 50 horses. And 50 sheep is also not the same thing as the abstract number 50; one is a group of animals, the other a mathematical concept. If someone were to say something to you involving the number 50, you would not simply assume that they're talking about sheep.

This perfectly solves the problem. If 100 watts / 20 watts equals only the number 5, with no "watts", then 100 million / 20 million also equals only the number 5, with no "million". But what about Matt's example? 80 million sheep - 50 million sheep = 30 million sheep; not just 30. That's because this is subtraction, not division. Units work differently depending on what operation you're performing!
If you're doing addition or subtraction, the units are preserved; you can take them off at the beginning and then put them back on at the end. But for multiplication and division, this is not the case. Division cancels out the units, removing them entirely, and multiplication gives you a new unit, equal to the previous unit squared. This seems kind of arbitrary, right? Why do they work differently depending on the operation? To understand this, let's go back to a different example that Matt used in his video. Near the beginning, when he's performing the ...
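As a quick sanity check of the rules described above (units survive addition and subtraction, cancel under division of like units, and combine under multiplication), here is a minimal Python sketch. It is illustrative only, not from the post, and the tiny Quantity class is made up for the example.

```python
# Minimal sketch of unit arithmetic: subtraction keeps the unit, division of
# like units cancels it, multiplication of like units squares it.
from dataclasses import dataclass

@dataclass
class Quantity:
    value: float
    unit: dict  # unit name -> exponent, e.g. {"watt": 1}; {} means dimensionless

    def __sub__(self, other):
        assert self.unit == other.unit, "can only subtract matching units"
        return Quantity(self.value - other.value, self.unit)

    def __truediv__(self, other):
        unit = {u: self.unit.get(u, 0) - other.unit.get(u, 0)
                for u in {*self.unit, *other.unit}}
        return Quantity(self.value / other.value,
                        {u: e for u, e in unit.items() if e != 0})

    def __mul__(self, other):
        unit = {u: self.unit.get(u, 0) + other.unit.get(u, 0)
                for u in {*self.unit, *other.unit}}
        return Quantity(self.value * other.value,
                        {u: e for u, e in unit.items() if e != 0})

watts = lambda v: Quantity(v, {"watt": 1})
million_sheep = lambda v: Quantity(v, {"million": 1, "sheep": 1})

print(watts(100) / watts(20))                 # value 5.0, unit {} -- just the number 5
print(million_sheep(80) - million_sheep(50))  # value 30.0, still "million sheep"
print(watts(100) * watts(20))                 # value 2000.0, unit {"watt": 2} -- watts squared
```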
Jun 25, 2024 • 4min

LW - I'm a bit skeptical of AlphaFold 3 by Oleg Trott

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: I'm a bit skeptical of AlphaFold 3, published by Oleg Trott on June 25, 2024 on LessWrong.

(also on https://olegtrott.substack.com)

So this happened: DeepMind (with 48 authors, including a new member of the British nobility) decided to compete with me. Or rather, with some of my work from 10+ years ago. Apparently, AlphaFold 3 can now predict how a given drug-like molecule will bind to its target protein. And it does so better than AutoDock Vina (the most cited molecular docking program, which I built at Scripps Research). On top of this, it doesn't even need a 3D structure of the target. It predicts it too! But I'm a bit skeptical. I'll try to explain why.

Consider a hypothetical scientific dataset where all data is duplicated: perhaps the scientists had trust issues and tried to check each others' work. Suppose you split this data randomly into training and test subsets at a ratio of 3-to-1, as is often done. Now, if all your "learning" algorithm does is memorize the training data, it will be very easy for it to do well on 75% of the test data, because 75% of the test data will have copies in the training data.

Scientists mistrusting each other are only one source of data redundancy, by the way. Different proteins can also be related to each other. Even when the sequence similarity between two proteins is low, because of evolutionary pressures, this similarity tends to be concentrated where it matters, which is the binding site. Lastly, scientists typically don't just take random proteins and random drug-like molecules, and try to determine their combined structures. Oftentimes, they take baby steps, choosing to study drug-like molecules similar to the ones already discovered for the same or related targets. So there can be lots of redundancy and near-redundancy in the public 3D data of drug-like molecules and proteins bound together.

Long ago, when I was a PhD student at Columbia, I trained a neural network to predict protein flexibility. The dataset I had was tiny, but it had interrelated proteins already. With a larger dataset, due to the Birthday Paradox, the interrelatedness would have probably been a much bigger concern. Back then, I decided that using a random train-test split would have been wrong. So I made sure that related proteins were never in both "train" and "test" subsets at the same time. With my model, I was essentially saying "Give me a protein, and (even) if it's unrelated to the ones in my training data, I can predict …"

The authors don't seem to do that. Their analysis reports that most of the proteins in the test dataset had kin in the training dataset with sequence identity in the 95-100 range. Some had sequence identity below 30, but I wonder if this should really be called "low". This makes it hard to interpret. Maybe the results tell us something about the model's ability to learn how molecules interact. Or maybe they tell us something about the redundancy of 3D data that people tend to deposit? Or some combination? Docking software is used to scan millions and billions of drug-like molecules looking for new potential binders. So it needs to be able to generalize, rather than just memorize. But the following bit makes me really uneasy.
The authors say: The second class of stereochemical violations is a tendency of the model to occasionally produce overlapping (clashing) atoms in the predictions. This sometimes manifests as extreme violations in homomers in which entire chains have been observed to overlap (Fig. 5e). If AlphaFold 3 is actually learning any non-obvious insights from data, about how molecules interact, why is it missing possibly the most obvious one of them all, which is that interpenetrating atoms are bad? On the other hand, if most of what it does is memorize and regurgitate data (when it can), this would explain such fail...
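To illustrate the leakage argument made earlier in this post (the duplicated-data thought experiment, and the group-aware split the author describes using for his protein-flexibility model), here is a small self-contained Python sketch. It is not from the post; the data are synthetic, and it uses scikit-learn's GroupShuffleSplit as one possible way to keep related items out of both subsets.

```python
# Sketch: with fully duplicated data, a random 3:1 split lets ~75% of test items
# have an exact copy in the training set; splitting by "family" (group) prevents that.
import numpy as np
from sklearn.model_selection import train_test_split, GroupShuffleSplit

n_families = 1000
families = np.repeat(np.arange(n_families), 2)   # every item appears twice (a duplicate pair)
items = families.copy()                          # duplicates are identical "measurements"

def leakage(train_idx, test_idx):
    """Fraction of test items whose family also appears in the training set."""
    train_families = set(families[train_idx])
    return np.mean([families[i] in train_families for i in test_idx])

# Naive random split, 3-to-1.
train_idx, test_idx = train_test_split(np.arange(len(items)),
                                       test_size=0.25, random_state=0)
print(f"random split leakage:  {leakage(train_idx, test_idx):.2f}")   # ~0.75

# Group-aware split: all members of a family land on the same side.
gss = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(gss.split(items, groups=families))
print(f"grouped split leakage: {leakage(train_idx, test_idx):.2f}")   # 0.00
```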
Jun 25, 2024 • 59min

LW - Book Review: Righteous Victims - A History of the Zionist-Arab Conflict by Yair Halberstadt

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Book Review: Righteous Victims - A History of the Zionist-Arab Conflict, published by Yair Halberstadt on June 25, 2024 on LessWrong.

I originally entered this in the ACX Book Review competition. Since it has not been selected as a finalist I'm now free to post it here. In truth it's a followup to my review of Morris's history of Israel's War of Independence.

In the wake of the October 7th attack on Israel and Israel's response, everyone seemed to agree that one side of the conflict was the epitome of evil, the reincarnation of the Nazis, with warfare in their blood and a pure unfiltered hatred of the enemy in their minds. The other side was a force for good, who just wanted peace and was doing the best they could in a difficult situation. The only problem is no one could agree which side was which. This is unfair. While the loudest voices may paint the world in black and white, as soon as you ignore them, you begin to encounter a whole range of more nuanced views - yet still find yourself no less confused.

Now for the most part my view is that unless you're willing to put in the effort to deeply understand conflicts in far off lands, you're best off not having an opinion on them, and definitely not one fed to you by the twitter or tiktok feed. Alas this conflict is not in a far away land. I live 20km from the border with Gaza. Most of my friends were called up to do reserve duty in the IDF. My children almost certainly will have to do the same once they grow up. Far too much of my income goes towards military spending rather than my bank account. I can't take the easy way out, so I have to do things the hard way. So I bought a copy of Benny Morris's Righteous Victims at exorbitant cost[1], and plowed through it. And I thought I'd share with you what I learned, so that if you do decide to opine on the Israel Palestine conflict, your opinion will hopefully be more educated.

Righteous Victims is a history of the Arab Zionist conflict from 1881 till 2001, written by one of the most respected historians of this conflict.

Bias

Morris is a liberal Zionist, but one whose aim in studying history was to strip back the comforting lies he'd been taught as a child, and find out the actual truth. None of his (serious) critics accuse him of lying, and his mastery of the primary sources is undisputed. Instead there are two main accusations leveled against him. The first he readily admits himself in the introduction. Almost all sources about this conflict come from British or Israeli archives. Arab literacy was far lower, Arab historiography of this conflict is a relatively new and small field, and Arab documents have for the most part not been made publicly available even when they exist. Meanwhile a wealth of Zionist material has been released to the public, and we have plenty of contemporary documents to rely on. While he tries to decipher the Arab perspective from the Zionist one, and relies on Arab documents when they are available, this is naturally going to be both a blindspot and a source of systematic bias.

The second is in choosing which events to highlight and which to ignore.
This is an impossible task - over 120 years the amount of relevant information is going to outweigh by many orders of magnitude the amount of space you have in your book, and by carefully selecting which facts to tell you can paint any story you like without ever actually lying. In practice you deal with this by covering the most important[2] events in plenty of detail, picking representative examples of other events, and giving aggregate statistics[3] to place the representative sample in context. However hard one tries here, it's always possible to accuse the author of favoring facts which paint one side or...
