The Nonlinear Library: LessWrong

The Nonlinear Fund
May 23, 2024 • 8min

LW - "Which chains-of-thought was that faster than?" by Emrik

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: "Which chains-of-thought was that faster than?", published by Emrik on May 23, 2024 on LessWrong.

Here's some good advice from Eliezer:

TAP: "How could I have thought that faster?"
WHEN[1] you complete a chain-of-thought
THEN ask yourself, "how could I have thought that faster?"

I really like this heuristic, and it's already paid its rent several times over for me. Most recently today, so I'll share the (slightly edited) cognitive trace of it as an example:

Example: To find the inverse of something, trace the chain forward a few times first
1. I was in the context of having just asked myself "what's the set of functions which have this function as its derivative?"
2. This is of course its integral, but I didn't want to use cached abstractions, and instead sought to get a generalized view of the landscape from first principles.
3. For about 10 seconds, I tried to hold the function f in my mind while trying to directly generate the integral landscape from it.
4. This seemed awfwly inefficient, so I changed tack: I already know some specific functions whose derivatives equal f, so I held those as the proximal thing in my mind while retracing the cognitive steps involved in their derivation.
5. After making those steps more salient in the forward direction (integral → derivative), it was easier to retrace the path in the opposite direction.
6. And once the derivative → integral trace was salient for a few examples, it was easier to generalize from the examples to produce the landscape of all the integrals.
7. There are multiple takeaways here, but one is:
   1. "If you struggle to generalize something, find a way to generate specific examples first, then generalize from the examples."

TAP: "Which chains-of-thought was that faster than?"
Imo, more important than asking "how could I have thought that faster?" is the inverse heuristic:

WHEN you complete a good chain-of-thought
THEN ask yourself, "which chains-of-thought was that faster than?"

Although, ideally, I wouldn't scope the trigger to every time you complete a thought, since that overburdens the general cue. Instead, maybe limit it to those times when you have an especially clear trace of it AND you have a hunch that something about it was unusually good.

WHEN you complete a good chain of thought
AND you have its trace in short-term memory
AND you hunch that something about it was unusually effective
THEN ask yourself, "which chains-of-thought was that faster than?"

Example: Sketching out my thoughts with pen-and-paper
1. Yesterday I was writing out some plans explicitly with pen and paper - enumerating my variables and drawing arrows between them.
2. I noticed - for the umpteenth time - that forcing myself to explicitly sketch out the problem (even with improvised visualizations) is far more cognitively ergonomic than keeping it in my head (see eg why you should write pseudocode).
3. But instead of just noting "yup, I should force myself to do more pen-and-paper", I asked myself two questions:
   1. "When does it help me think, and when does it just slow me down?"
      1. This part is important: scope your insight sharply to contexts where it's usefwl - hook your idea into the contexts where you want it triggered - so you avoid wasting memory-capacity on linking it up to useless stuff.
      2. In other words, you want to minimize (unwanted) associative interference so you can remember stuff at lower cost.
      3. My conclusion was that pen-and-paper is good when I'm trying to map complex relations between a handfwl of variables.
      4. And it is NOT good when I have just a single proximal idea that I want to compare against a myriad of samples with high false-positive rate - that's instead where I should be doing inside-head thinking to exploit the brain's massively parallel distributed processor.
   2. "Why am I so reluctant to do it?"
      1. This se...
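A minimal worked instance of the first example above (the forward-then-backward trace), using a hypothetical target function rather than Emrik's unstated f: suppose the function is 2x. Trace forward by differentiating candidates you already know, then invert the trace and generalize to the whole landscape of antiderivatives.

```latex
% Forward trace: derivation steps you already know how to run
\frac{d}{dx}\,x^{2} = 2x, \qquad \frac{d}{dx}\left(x^{2}+5\right) = 2x
% Backward trace, generalized from the examples to the full landscape
\int 2x \, dx = x^{2} + C, \qquad C \in \mathbb{R}
```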
May 22, 2024 • 25min

LW - Do Not Mess With Scarlett Johansson by Zvi

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Do Not Mess With Scarlett Johansson, published by Zvi on May 22, 2024 on LessWrong.

I repeat. Do not mess with Scarlett Johansson. You would think her movies, and her suit against Disney, would make this obvious. Apparently not so.

Andrej Karpathy (co-founder OpenAI, departed earlier), May 14: The killer app of LLMs is Scarlett Johansson. You all thought it was math or something.

You see, there was this voice they created for GPT-4o, called 'Sky.' People noticed it sounded suspiciously like Scarlett Johansson, who voiced the AI in the movie Her, which Sam Altman says is his favorite movie of all time, which he says inspired OpenAI 'more than a little bit,' and then he tweeted "Her" on its own right before the GPT-4o presentation, and which was the comparison point for many people reviewing the GPT-4o debut?

Quite the Coincidence

I mean, surely that couldn't have been intentional. Oh, no.

Kylie Robison: I asked Mira Murati about the Scarlett Johansson-type voice in today's demo of GPT-4o. She clarified it's not designed to mimic her, and said someone in the audience asked this exact same question!

Kylie Robison in Verge (May 13): Title: ChatGPT will be able to talk to you like Scarlett Johansson in Her.

OpenAI reports on how it created and selected its five GPT-4o voices. OpenAI: We support the creative community and worked closely with the voice acting industry to ensure we took the right steps to cast ChatGPT's voices. Each actor receives compensation above top-of-market rates, and this will continue for as long as their voices are used in our products. We believe that AI voices should not deliberately mimic a celebrity's distinctive voice - Sky's voice is not an imitation of Scarlett Johansson but belongs to a different professional actress using her own natural speaking voice. To protect their privacy, we cannot share the names of our voice talents. … Looking ahead, you can expect even more options as we plan to introduce additional voices in ChatGPT to better match the diverse interests and preferences of users.

Jessica Taylor: My "Sky's voice is not an imitation of Scarlett Johansson" T-shirt has people asking a lot of questions already answered by my shirt.

OpenAI: We've heard questions about how we chose the voices in ChatGPT, especially Sky. We are working to pause the use of Sky while we address them.

Variety: Altman said in an interview last year that "Her" is his favorite movie.

Variety: OpenAI Suspends ChatGPT Voice That Sounds Like Scarlett Johansson in 'Her': AI 'Should Not Deliberately Mimic a Celebrity's Distinctive Voice.' [WSJ had similar duplicative coverage.]

Flowers from the Future: That's why we can't have nice things. People bore me.

Again: Do not mess with Scarlett Johansson. She is Black Widow. She sued Disney.

Several hours after compiling the above, I was happy to report that they did indeed mess with Scarlett Johansson. She is pissed.

Bobby Allyn (NPR): Scarlett Johansson says she is 'shocked, angered' over new ChatGPT voice. … Johansson's legal team has sent OpenAI two letters asking the company to detail the process by which it developed a voice the tech company dubbed "Sky," Johansson's publicist told NPR in a revelation that has not been previously reported.

NPR then published her statement, which follows.
Scarlett Johansson's Statement

Scarlett Johansson: Last September, I received an offer from Sam Altman, who wanted to hire me to voice the current ChatGPT 4.0 system. He told me that he felt that by my voicing the system, I could bridge the gap between tech companies and creatives and help consumers to feel comfortable with the seismic shift concerning humans and AI. He said he felt that my voice would be comforting to people. After much consideration and for personal reasons, I declined the offer. Nine months later, my friends,...
May 22, 2024 • 6min

LW - Anthropic announces interpretability advances. How much does this advance alignment? by Seth Herd

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Anthropic announces interpretability advances. How much does this advance alignment?, published by Seth Herd on May 22, 2024 on LessWrong.

Anthropic just published a pretty impressive set of results in interpretability. This raises, for me, some questions and a concern:

Interpretability helps, but it isn't alignment, right? It seems to me as though the vast bulk of alignment funding is now going to interpretability. Who is thinking about how to leverage interpretability into alignment?

It intuitively seems as though we are better off the more we understand the cognition of foundation models. I think this is true, but there are sharp limits: it will be impossible to track the full cognition of an AGI, and simply knowing what it's thinking about will be inadequate to know whether it's making plans you like. One can think about bioweapons, for instance, to either produce them or prevent producing them. More on these at the end; first a brief summary of their results.

In this work, they located interpretable features in Claude 3 Sonnet using sparse autoencoders, and manipulated model behavior by using those features as steering vectors. They find features for subtle concepts; they highlight features for:

The Golden Gate Bridge 34M/31164353: Descriptions of or references to the Golden Gate Bridge.
Brain sciences 34M/9493533: discussions of neuroscience and related academic research on brains or minds.
Monuments and popular tourist attractions 1M/887839.
Transit infrastructure 1M/3.
[links to examples]

... We also find more abstract features - responding to things like bugs in computer code, discussions of gender bias in professions, and conversations about keeping secrets. ...we found features corresponding to:

Capabilities with misuse potential (code backdoors, developing biological weapons)
Different forms of bias (gender discrimination, racist claims about crime)
Potentially problematic AI behaviors (power-seeking, manipulation, secrecy)

Presumably, the existence of such features will surprise nobody who's used and thought about large language models. It is difficult to imagine how they would do what they do without using representations of subtle and abstract concepts.

They used the dictionary learning approach, and found distributed representations of features: "Our general approach to understanding Claude 3 Sonnet is based on the linear representation hypothesis and the superposition hypothesis" (from the publication Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet). Or to put it more plainly: it turns out that each concept is represented across many neurons, and each neuron is involved in representing many concepts. Representations in the brain definitely follow that description, and the structure of representations seems pretty similar as far as we can guess from animal studies and limited data on human language use. They also include a fascinating image of near neighbors to the feature for internal conflict (see header image).

So, back to the broader question: it is clear how this type of interpretability helps with AI safety: being able to monitor when it's activating features for things like bioweapons, and to use those features as steering vectors, can help control the model's behavior. It is not clear to me how this generalizes to AGI.
And I am concerned that too few of us are thinking about this. It seems pretty apparent how detecting lying will dramatically help in pretty much any conceivable plan for technical alignment of AGI. But it seems like being able to monitor an entire thought process of a being smarter than us is impossible on the face of it. I think the hope is that we can detect and monitor cognition that is about dangerous topics, so we don't need to follow its full train of thought. If we can tell what an AGI is thinking ...
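For readers who want a concrete picture of what "using features as steering vectors" means in the summary above, here is a minimal numpy sketch. It is an illustration under assumed shapes and hypothetical names (W_dec, steer, feature_id), not Anthropic's code or API: in the paper the feature directions come from a sparse autoencoder trained on Claude 3 Sonnet's activations, whereas here random unit vectors stand in for the learned dictionary.

```python
# Sketch only: steering = adding a scaled feature direction back into a model activation.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_features = 512, 4096                 # hidden width; overcomplete dictionary size
W_dec = rng.normal(size=(n_features, d_model))  # stand-in for SAE decoder directions
W_dec /= np.linalg.norm(W_dec, axis=1, keepdims=True)

def steer(activation: np.ndarray, feature_id: int, scale: float) -> np.ndarray:
    """Nudge one activation vector along the chosen feature's decoder direction."""
    return activation + scale * W_dec[feature_id]

h = rng.normal(size=d_model)                    # a single (fake) residual-stream vector
h_steered = steer(h, feature_id=42, scale=8.0)
print(np.dot(h_steered - h, W_dec[42]))         # ~8.0: the edit moved h along feature 42
```

In the real setting the steered activation would be written back into the model's forward pass at a chosen layer, so downstream computation reflects the amplified (or suppressed, with a negative scale) feature.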
May 21, 2024 • 33min

LW - On Dwarkesh's Podcast with OpenAI's John Schulman by Zvi

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: On Dwarkesh's Podcast with OpenAI's John Schulman, published by Zvi on May 21, 2024 on LessWrong.

Dwarkesh Patel recorded a podcast with John Schulman, cofounder of OpenAI and at the time their head of current model post-training. Transcript here. John's job at the time was to make the current AIs do what OpenAI wanted them to do. That is an important task, but one that employs techniques that their at-the-time head of alignment, Jan Leike, made clear we should not expect to work on future more capable systems. I strongly agree with Leike on that.

Then Sutskever left and Leike resigned, and John Schulman was made the new head of alignment, now charged with what superalignment efforts remain at OpenAI to give us the ability to control future AGIs and ASIs. This gives us a golden opportunity to assess where his head is at, without him knowing he was about to step into that role.

There is no question that John Schulman is a heavyweight. He executes and ships. He knows machine learning. He knows post-training and mundane alignment. The question is, does he think well about this new job that has been thrust upon him?

The Big Take

Overall I was pleasantly surprised and impressed. In particular, I was impressed by John's willingness to accept uncertainty and not knowing things. He does not have a good plan for alignment, but he is far less confused about this fact than most others in similar positions. He does not know how to best navigate the situation if AGI suddenly happened ahead of schedule in multiple places within a short time frame, but I have not ever heard a good plan for that scenario, and his speculations seem about as directionally correct and helpful as one could hope for there.

Are there answers that are cause for concern, and places where he needs to fix misconceptions as quickly as possible? Oh, hell yes. His reactions to potential scenarios involved radically insufficient amounts of slowing down, halting and catching fire, freaking out and general understanding of the stakes. Some of that I think was about John and others at OpenAI using a very weak definition of AGI (perhaps partly because of the Microsoft deal?) but also partly he does not seem to appreciate what it would mean to have an AI doing his job, which he says he expects in a median of five years.

His answer on instrumental convergence is worrisome, as others have pointed out. He dismisses concerns that an AI given a bounded task would start doing things outside the intuitive task scope, or the dangers of an AI 'doing a bunch of wacky things' a human would not have expected. On the plus side, it shows understanding of the key concepts on a basic (but not yet deep) level, and he readily admits it is an issue with commands that are likely to be given in practice, such as 'make money.'

In general, he seems willing to react to advanced capabilities by essentially scaling up various messy solutions in ways that I predict would stop working at that scale or with something that outsmarts you and that has unanticipated affordances and reason to route around typical in-distribution behaviors. He does not seem to have given sufficient thought to what happens when a lot of his assumptions start breaking all at once, exactly because the AI is now capable enough to be properly dangerous.
As with the rest of OpenAI, another load-bearing assumption is presuming gradual changes throughout all this, including assuming past techniques will not break. I worry that will not hold. He has some common confusions about regulatory options and where we have viable intervention points within competitive dynamics and game theory, but that's understandable, and also was at the time very much not his department. As with many others, there seems to be a disconnect. A lot of the thinking here seems like excellent practical thi...
May 21, 2024 • 12min

LW - New voluntary commitments (AI Seoul Summit) by Zach Stein-Perlman

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: New voluntary commitments (AI Seoul Summit), published by Zach Stein-Perlman on May 21, 2024 on LessWrong.

Basically the companies commit to make responsible scaling policies.

Part of me says this is amazing, the best possible commitment short of all committing to a specific RSP. It's certainly more real than almost all other possible kinds of commitments. But as far as I can tell, people pay almost no attention to what RSP-ish documents (Anthropic, OpenAI, Google) actually say and whether the companies are following them. The discourse is more like "Anthropic, OpenAI, and Google have safety plans and other companies don't." Hopefully that will change.

Maybe "These commitments represent a crucial and historic step forward for international AI governance." It does seem nice from an international-governance perspective that Mistral AI, TII, and a Chinese company joined.

The UK and Republic of Korea governments announced that the following organisations have agreed to the Frontier AI Safety Commitments: Amazon, Anthropic, Cohere, Google, G42, IBM, Inflection AI, Meta, Microsoft, Mistral AI, Naver, OpenAI, Samsung Electronics, Technology Innovation Institute, xAI, and Zhipu.ai.

The above organisations, in furtherance of safe and trustworthy AI, undertake to develop and deploy their frontier AI models and systems[1] responsibly, in accordance with the following voluntary commitments, and to demonstrate how they have achieved this by publishing a safety framework focused on severe risks by the upcoming AI Summit in France. Given the evolving state of the science in this area, the undersigned organisations' approaches (as detailed in paragraphs I-VIII) to meeting Outcomes 1, 2 and 3 may evolve in the future. In such instances, organisations will provide transparency on this, including their reasons, through public updates.

The above organisations also affirm their commitment to implement current best practices related to frontier AI safety, including: internal and external red-teaming of frontier AI models and systems for severe and novel threats; to work toward information sharing; to invest in cybersecurity and insider threat safeguards to protect proprietary and unreleased model weights; to incentivize third-party discovery and reporting of issues and vulnerabilities; to develop and deploy mechanisms that enable users to understand if audio or visual content is AI-generated; to publicly report model or system capabilities, limitations, and domains of appropriate and inappropriate use; to prioritize research on societal risks posed by frontier AI models and systems; and to develop and deploy frontier AI models and systems to help address the world's greatest challenges.

Outcome 1. Organisations effectively identify, assess and manage risks when developing and deploying their frontier AI models and systems. They will:

I. Assess the risks posed by their frontier models or systems across the AI lifecycle, including before deploying that model or system, and, as appropriate, before and during training. Risk assessments should consider model capabilities and the context in which they are developed and deployed, as well as the efficacy of implemented mitigations to reduce the risks associated with their foreseeable use and misuse.
They should also consider results from internal and external evaluations as appropriate, such as by independent third-party evaluators, their home governments[2], and other bodies their governments deem appropriate. II. Set out thresholds[3] at which severe risks posed by a model or system, unless adequately mitigated, would be deemed intolerable. Assess whether these thresholds have been breached, including monitoring how close a model or system is to such a breach. These thresholds should be defined with input from trusted actors, including organisations' respective ho...
May 21, 2024 • 3min

LW - [Linkpost] Statement from Scarlett Johansson on OpenAI's use of the "Sky" voice, that was shockingly similar to her own voice. by Linch

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: [Linkpost] Statement from Scarlett Johansson on OpenAI's use of the "Sky" voice, that was shockingly similar to her own voice., published by Linch on May 21, 2024 on LessWrong.

Scarlett Johansson makes a statement about the "Sky" voice, a voice for GPT-4o that OpenAI recently pulled after less than a week of prime time.

tl;dr: OpenAI made an offer last September to Johansson; she refused. They offered again 2 days before the public demo. Scarlett Johansson claims that the voice was so similar that even friends and family noticed. She hired legal counsel to ask OpenAI to "detail the exact process by which they created the 'Sky' voice," which resulted in OpenAI taking the voice down.

Full statement below:

Last September, I received an offer from Sam Altman, who wanted to hire me to voice the current ChatGPT 4.0 system. He told me that he felt that by my voicing the system, I could bridge the gap between tech companies and creatives and help consumers to feel comfortable with the seismic shift concerning humans and AI. He said he felt that my voice would be comforting to people. After much consideration and for personal reasons, I declined the offer.

Nine months later, my friends, family and the general public all noted how much the newest system named 'Sky' sounded like me. When I heard the released demo, I was shocked, angered and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine that my closest friends and news outlets could not tell the difference. Mr. Altman even insinuated that the similarity was intentional, tweeting a single word 'her' - a reference to the film in which I voiced a chat system, Samantha, who forms an intimate relationship with a human.

Two days before the ChatGPT 4.0 demo was released, Mr. Altman contacted my agent, asking me to reconsider. Before we could connect, the system was out there. As a result of their actions, I was forced to hire legal counsel, who wrote two letters to Mr. Altman and OpenAI, setting out what they had done and asking them to detail the exact process by which they created the 'Sky' voice. Consequently, OpenAI reluctantly agreed to take down the 'Sky' voice.

In a time when we are all grappling with deepfakes and the protection of our own likeness, our own work, our own identities, I believe these are questions that deserve absolute clarity. I look forward to resolution in the form of transparency and the passage of appropriate legislation to help ensure that individual rights are protected.

Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
May 20, 2024 • 17min

LW - Anthropic: Reflections on our Responsible Scaling Policy by Zac Hatfield-Dodds

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Anthropic: Reflections on our Responsible Scaling Policy, published by Zac Hatfield-Dodds on May 20, 2024 on LessWrong.

Last September we published our first Responsible Scaling Policy (RSP) [LW discussion], which focuses on addressing catastrophic safety failures and misuse of frontier models. In adopting this policy, our primary goal is to help turn high-level safety concepts into practical guidelines for fast-moving technical organizations and demonstrate their viability as possible standards. As we operationalize the policy, we expect to learn a great deal and plan to share our findings. This post shares reflections from implementing the policy so far. We are also working on an updated RSP and will share this soon.

We have found having a clearly-articulated policy on catastrophic risks extremely valuable. It has provided a structured framework to clarify our organizational priorities and frame discussions around project timelines, headcount, threat models, and tradeoffs. The process of implementing the policy has also surfaced a range of important questions, projects, and dependencies that might otherwise have taken longer to identify or gone undiscussed.

Balancing the desire for strong commitments with the reality that we are still seeking the right answers is challenging. In some cases, the original policy is ambiguous and needs clarification. In cases where there are open research questions or uncertainties, setting overly-specific requirements is unlikely to stand the test of time. That said, as industry actors face increasing commercial pressures we hope to move from voluntary commitments to established best practices and then well-crafted regulations.

As we continue to iterate on and improve the original policy, we are actively exploring ways to incorporate practices from existing risk management and operational safety domains. While none of these domains alone will be perfectly analogous, we expect to find valuable insights from nuclear security, biosecurity, systems safety, autonomous vehicles, aerospace, and cybersecurity. We are building an interdisciplinary team to help us integrate the most relevant and valuable practices from each.

Our current framework for doing so is summarized below, as a set of five high-level commitments.

1. Establishing Red Line Capabilities. We commit to identifying and publishing "Red Line Capabilities" which might emerge in future generations of models and would present too much risk if stored or deployed under our current safety and security practices (referred to as the ASL-2 Standard).

2. Testing for Red Line Capabilities (Frontier Risk Evaluations). We commit to demonstrating that the Red Line Capabilities are not present in models, or - if we cannot do so - taking action as if they are (more below). This involves collaborating with domain experts to design a range of "Frontier Risk Evaluations" - empirical tests which, if failed, would give strong evidence against a model being at or near a red line capability. We also commit to maintaining a clear evaluation process and a summary of our current evaluations publicly.

3. Responding to Red Line Capabilities. We commit to develop and implement a new standard for safety and security sufficient to handle models that have the Red Line Capabilities. This set of measures is referred to as the ASL-3 Standard.
We commit not only to define the risk mitigations comprising this standard, but also to detail and follow an assurance process to validate the standard's effectiveness. Finally, we commit to pause training or deployment if necessary to ensure that models with Red Line Capabilities are only trained, stored and deployed when we are able to apply the ASL-3 standard.

4. Iteratively extending this policy. Before we proceed with activities which require the ASL-3 standard, we commit...
May 20, 2024 • 1h 9min

LW - OpenAI: Exodus by Zvi

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: OpenAI: Exodus, published by Zvi on May 20, 2024 on LessWrong.

Previously: OpenAI: Facts From a Weekend, OpenAI: The Battle of the Board, OpenAI: Leaks Confirm the Story, OpenAI: Altman Returns, OpenAI: The Board Expands.

Ilya Sutskever and Jan Leike have left OpenAI. This is almost exactly six months after Altman's temporary firing and The Battle of the Board, the day after the release of GPT-4o, and soon after a number of other recent safety-related OpenAI departures. Many others working on safety have also left recently. This is part of a longstanding pattern at OpenAI.

Jan Leike later offered an explanation for his decision on Twitter. Leike asserts that OpenAI has lost the mission on safety and culturally been increasingly hostile to it. He says the superalignment team was starved for resources, with its public explicit compute commitments dishonored, and that safety has been neglected on a widespread basis, not only superalignment but also including addressing the safety needs of the GPT-5 generation of models.

Altman acknowledged there was much work to do on the safety front. Altman and Brockman then offered a longer response that seemed to say exactly nothing new.

Then we learned that OpenAI has systematically misled and then threatened its departing employees, forcing them to sign draconian lifetime non-disparagement agreements, which they are forbidden to reveal due to their NDA. Altman has to some extent acknowledged this and promised to fix it once the allegations became well known, but so far there has been no fix implemented beyond an offer to contact him privately for relief.

These events all seem highly related. Also these events seem quite bad. What is going on?

This post walks through recent events and informed reactions to them. The first ten sections address departures from OpenAI, especially Sutskever and Leike. The next five sections address the NDAs and non-disparagement agreements. Then at the end I offer my perspective, highlight another, and look to paths forward.

Table of Contents
1. The Two Departure Announcements
2. Who Else Has Left Recently?
3. Who Else Has Left Overall?
4. Early Reactions to the Departures
5. The Obvious Explanation: Altman
6. Jan Leike Speaks
7. Reactions After Leike's Statement
8. Greg Brockman and Sam Altman Respond to Leike
9. Reactions from Some Folks Unworried About Highly Capable AI
10. Don't Worry, Be Happy?
11. The Non-Disparagement and NDA Clauses
12. Legality in Practice
13. Implications and Reference Classes
14. Altman Responds on Non-Disparagement Clauses
15. So, About That Response
16. How Bad Is All This?
17. Those Who Are Against These Efforts to Prevent AI From Killing Everyone
18. What Will Happen Now?
19. What Else Might Happen or Needs to Happen Now?

The Two Departure Announcements

Here are the full announcements and top-level internal statements made on Twitter around the departures of Ilya Sutskever and Jan Leike.

Ilya Sutskever: After almost a decade, I have made the decision to leave OpenAI. The company's trajectory has been nothing short of miraculous, and I'm confident that OpenAI will build AGI that is both safe and beneficial under the leadership of @sama, @gdb, @miramurati and now, under the excellent research leadership of Jakub Pachocki. It was an honor and a privilege to have worked together, and I will miss everyone dearly.
So long, and thanks for everything. I am excited for what comes next - a project that is very personally meaningful to me about which I will share details in due time. [Ilya then shared the photo below]

Jakub Pachocki: Ilya introduced me to the world of deep learning research, and has been a mentor to me, and a great collaborator for many years. His incredible vision for what deep learning could become was foundational to what OpenAI, and the field of AI, is today. I...
May 20, 2024 • 46sec

LW - Jaan Tallinn's 2023 Philanthropy Overview by jaan

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Jaan Tallinn's 2023 Philanthropy Overview, published by jaan on May 20, 2024 on LessWrong.

to follow up my philanthropic pledge from 2020, i've updated my philanthropy page with 2023 results. in 2023 my donations funded $44M worth of endpoint grants ($43.2M excluding software development and admin costs) - exceeding my commitment of $23.8M (20k times $1190.03 - the minimum price of ETH in 2023).

Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
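As a quick arithmetic check of the commitment figure quoted above (my restatement, not part of the original post):

```latex
20{,}000 \times \$1190.03 = \$23{,}800{,}600 \approx \$23.8\text{M}, \qquad \$44\text{M} > \$23.8\text{M}
```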
May 19, 2024 • 25min

LW - International Scientific Report on the Safety of Advanced AI: Key Information by Aryeh Englander

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: International Scientific Report on the Safety of Advanced AI: Key Information, published by Aryeh Englander on May 19, 2024 on LessWrong.

I thought that the recently released International Scientific Report on the Safety of Advanced AI seemed like a pretty good summary of the state of the field on AI risks, in addition to being about as close to a statement of expert consensus as we're likely to get at this point. I noticed that each section of the report has a useful "Key Information" bit with a bunch of bullet points summarizing that section. So for my own use as well as perhaps the use of others, and because I like bullet-point summaries, I've copy-pasted all the "Key Information" lists here.

1 Introduction

[Bullet points taken from the "About this report" part of the Executive Summary]

This is the interim publication of the first 'International Scientific Report on the Safety of Advanced AI'. A diverse group of 75 artificial intelligence (AI) experts contributed to this report, including an international Expert Advisory Panel nominated by 30 countries, the European Union (EU), and the United Nations (UN). Led by the Chair of this report, the independent experts writing this report collectively had full discretion over its content.

At a time of unprecedented progress in AI development, this first publication restricts its focus to a type of AI that has advanced particularly rapidly in recent years: General-purpose AI, or AI that can perform a wide variety of tasks. Amid rapid advancements, research on general-purpose AI is currently in a time of scientific discovery and is not yet settled science.

People around the world will only be able to enjoy general-purpose AI's many potential benefits safely if its risks are appropriately managed. This report focuses on identifying these risks and evaluating technical methods for assessing and mitigating them. It does not aim to comprehensively assess all possible societal impacts of general-purpose AI, including its many potential benefits.

For the first time in history, this interim report brought together experts nominated by 30 countries, the EU, and the UN, and other world-leading experts, to provide a shared scientific, evidence-based foundation for discussions and decisions about general-purpose AI safety.

We continue to disagree on several questions, minor and major, around general-purpose AI capabilities, risks, and risk mitigations. But we consider this project essential for improving our collective understanding of this technology and its potential risks, and for moving closer towards consensus and effective risk mitigation to ensure people can experience the potential benefits of general-purpose AI safely. The stakes are high. We look forward to continuing this effort.

2 Capabilities

2.1 How does General-Purpose AI gain its capabilities?

General-purpose AI models and systems can produce text, images, video, labels for unlabelled data, and initiate actions. The lifecycle of general-purpose AI models and systems typically involves computationally intensive 'pre-training', labour-intensive 'fine-tuning', and continual post-deployment monitoring and updates.

There are various types of general-purpose AI. Examples of general-purpose AI models include:
Chatbot-style language models, such as GPT-4, Gemini-1.5, Claude-3, Qwen1.5, Llama-3, and Mistral Large.
Image generators, such as DALLE-3, Midjourney-5, and Stable Diffusion-3.
Video generators, such as SORA.
Robotics and navigation systems, such as PaLM-E.
Predictors of various structures in molecular biology, such as AlphaFold 3.

2.2 What current general-purpose AI systems are capable of

General-purpose AI capabilities are difficult to estimate reliably but most experts agree that current general-purpose AI capabilities include: Assisting programmers and writing short ...
