The Nonlinear Library: LessWrong

The Nonlinear Fund
Jun 16, 2024 • 15min

LW - CIV: a story by Richard Ngo

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: CIV: a story, published by Richard Ngo on June 16, 2024 on LessWrong. The room was cozy despite its size, with wood-lined walls reflecting the dim lighting. At one end, a stone fireplace housed a roaring fire; in the middle stood a huge oak table. The woman seated at the head of it rapped her gavel. "I hereby call to order the first meeting of the Parliamentary Subcommittee on Intergalactic Colonization. We'll start with brief opening statements, for which each representative will be allocated one minute, including - " "Oh, enough with the pomp, Victoria. It's just the four of us." The representative for the Liberal Democrats waved his hand around the nearly-empty room. Victoria sniffed. "It's important, Stuart. This is a decision that will have astronomical implications. And it's recorded, besides, so we should do things by the book. Carla, you're up first." The woman at the end of the table stood with a smile. "Thank you, Victoria. I'm speaking on behalf of the Labour party, and I want to start by reminding you all of our place in history. We stand here in a world that has been shaped by centuries of colonialism. Now we're considering another wave of colonization, this one far vaster in scale. We need to - " "Is this just a linguistic argument?" the fourth person at the table drawled. "We can call it something different if that would make you feel better. Say, universe settlement." "Like the settlements in Palestine?" "Oh, come on, Carla." "No, Milton, this is a crucial point. We're talking about the biggest power grab the world has ever seen. You think Leopold II was bad when he was in charge of the Congo? Imagine what people will do if you give each of them total power over a whole solar system! Even libertarians like you have to admit it would be a catastrophe. If there's any possibility that we export oppression from earth across the entire universe, we should burn the rockets and stay home instead." "Okay, thank you Carla," Victoria cut in. "That's time. Stuart, you're up next." Stuart stood. "Speaking on behalf of the Liberal Democrats, I have to admit this is a tricky one. The only feasible way to send humans out to other galaxies is as uploaded minds, but many of our usual principles break for them. I want civilization to be democratic, but what does 'one person one vote' even mean when people can copy and paste themselves? I want human rights for all, but what do human rights even mean when you can just engineer minds who don't want those rights?" "So as much as I hate the idea of segregating civilization, I think it's necessary. Biological humans should get as much territory as we will ever use. But realistically, given the lightspeed constraint, we're never going to actually want to leave the Milky Way. Then the rest of the Virgo Supercluster should be reserved for human uploads. Beyond that, anything else we can reach we should fill with as much happiness and flourishing as possible, no matter how alien it seems to us. After all, as our esteemed predecessor John Stuart Mill once said…" He frowned, and paused for a second. "...as he said, the sole objective of government should be the greatest good for the greatest number." Stuart sat, looking a little disquieted. "Thank you, Stuart. I'll make my opening statement next." Victoria stood and leaned forward, sweeping her eyes across the others. 
"I'm here representing the Conservatives. It's tempting to think that we can design a good society with just the right social engineering, just the right nudges. But the one thing we conservatives know for sure is: it won't work. Whatever clever plan you come up with, it won't be stable. Given the chance, people will push towards novelty and experimentation and self-modification, and the whole species will end up drifting towards something alien and inhuman. "Hard ru...
Jun 15, 2024 • 5min

LW - MIRI's June 2024 Newsletter by Harlan

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: MIRI's June 2024 Newsletter, published by Harlan on June 15, 2024 on LessWrong.

MIRI updates

MIRI Communications Manager Gretta Duleba explains MIRI's current communications strategy. We hope to clearly communicate to policymakers and the general public why there's an urgent need to shut down frontier AI development, and make the case for installing an "off-switch". This will not be easy, and there is a lot of work to be done. Some projects we're currently exploring include a new website, a book, and an online reference resource.

Rob Bensinger argues, contra Leopold Aschenbrenner, that the US government should not race to develop artificial superintelligence. "If anyone builds it, everyone dies." Instead, Rob outlines a proposal for the US to spearhead an international alliance to halt progress toward the technology.

At the end of June, the Agent Foundations team, including Scott Garrabrant and others, will be parting ways with MIRI to continue their work as independent researchers. The team was originally set up and "sponsored" by Nate Soares and Eliezer Yudkowsky. However, as AI capabilities have progressed rapidly in recent years, Nate and Eliezer have become increasingly pessimistic about this type of work yielding significant results within the relevant timeframes. Consequently, they have shifted their focus to other priorities. Senior MIRI leadership explored various alternatives, including reorienting the Agent Foundations team's focus and transitioning them to an independent group under MIRI fiscal sponsorship with restricted funding, similar to AI Impacts. Ultimately, however, we decided that parting ways made the most sense. The Agent Foundations team has produced some stellar work over the years, and made a true attempt to tackle one of the most crucial challenges humanity faces today. We are deeply grateful for their many years of service and collaboration at MIRI, and we wish them the very best in their future endeavors.

The Technical Governance Team responded to NIST's request for comments on draft documents related to the AI Risk Management Framework. The team also sent comments in response to the "Framework for Mitigating AI Risks" put forward by U.S. Senators Mitt Romney (R-UT), Jack Reed (D-RI), Jerry Moran (R-KS), and Angus King (I-ME).

Brittany Ferrero has joined MIRI's operations team. Previously, she worked on projects such as the Embassy Network and Open Lunar Foundation. We're excited to have her help to execute on our mission.

News and links

AI alignment researcher Paul Christiano was appointed as head of AI safety at the US AI Safety Institute. Last fall, Christiano published some of his thoughts about AI regulation as well as responsible scaling policies.

The Superalignment team at OpenAI has been disbanded following the departure of its co-leaders Ilya Sutskever and Jan Leike. The team was launched last year to try to solve the AI alignment problem in four years. However, Leike says that the team struggled to get the compute it needed and that "safety culture and processes have taken a backseat to shiny products" at OpenAI.
This seems extremely concerning from the perspective of evaluating OpenAI's seriousness when it comes to safety and robustness work, particularly given that a similar OpenAI exodus occurred in 2020 in the wake of concerns about OpenAI's commitment to solving the alignment problem. Vox's Kelsey Piper reports that employees who left OpenAI were subject to an extremely restrictive NDA indefinitely preventing them from criticizing the company (or admitting that they were under an NDA), under threat of losing their vested equity in the company. OpenAI executives have since contacted former employees to say that they will not enforce the NDAs. Rob Bensinger comments on these developments here, strongly criticizing OpenAI for...
Jun 15, 2024 • 16min

LW - Rational Animations' intro to mechanistic interpretability by Writer

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Rational Animations' intro to mechanistic interpretability, published by Writer on June 15, 2024 on LessWrong.

In our new video, we talk about research on interpreting InceptionV1, a convolutional neural network. Researchers have been able to understand the function of neurons and channels inside the network and uncover visual processing algorithms by looking at the weights. The work on InceptionV1 is early but landmark mechanistic interpretability research, and it functions well as an introduction to the field. We also go into the rationale and goals of the field and mention some more recent research near the end. Our main source material is the circuits thread in the Distill journal and this article on feature visualization. The author of the script is Arthur Frost. I have included the script below, although I recommend watching the video since the script has been written with accompanying moving visuals in mind.

Intro

In 2018, researchers trained an AI to find out if people were at risk of heart conditions based on pictures of their eyes, and somehow the AI also learned to tell people's biological sex with incredibly high accuracy. How? We're not entirely sure.

The crazy thing about Deep Learning is that you can give an AI a set of inputs and outputs, and it will slowly work out for itself what the relationship between them is. We didn't teach AIs how to play chess, Go, and Atari games by showing them human experts - we taught them how to work it out for themselves. And the issue is, now they have worked it out for themselves, and we don't know what it is they worked out.

Current state-of-the-art AIs are huge. Meta's largest LLaMA2 model uses 70 billion parameters spread across 80 layers, all doing different things. It's deep learning models like these which are being used for everything from hiring decisions to healthcare and criminal justice to what YouTube videos get recommended. Many experts believe that these models might even one day pose existential risks. So as these automated processes become more widespread and significant, it will really matter that we understand how these models make choices.

The good news is, we've got a bit of experience uncovering the mysteries of the universe. We know that humans are made up of trillions of cells, and by investigating those individual cells we've made huge advances in medicine and genetics. And learning the properties of the atoms which make up objects has allowed us to develop modern material science and high-precision technology like computers. If you want to understand a complex system with billions of moving parts, sometimes you have to zoom in.

That's exactly what Chris Olah and his team did starting in 2015. They focused on small groups of neurons inside image models, and they were able to find distinct parts responsible for detecting everything from curves and circles to dog heads and cars.

In this video we'll:
- Briefly explain how (convolutional) neural networks work
- Visualise what individual neurons are doing
- Look at how neurons - the most basic building blocks of the neural network - combine into 'circuits' to perform tasks
- Explore why interpreting networks is so hard

There will also be lots of pictures of dogs, like this one. Let's get going. We'll start with a brief explanation of how convolutional neural networks are built.
Here's a network that's trained to label images. An input image comes in on the left, and it flows along through the layers until we get an output on the right - the model's attempt to classify the image into one of the categories. This particular model is called InceptionV1, and the images it's learned to classify are from a massive collection called ImageNet. ImageNet has 1000 different categories of image, like "sandal" and "saxophone" and "sarong" (which, if you don't know, is a k...
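As a rough, hands-on companion to the excerpt above (not part of the video or the circuits thread), here is a minimal PyTorch sketch that loads torchvision's GoogLeNet, which is an implementation of InceptionV1, hooks one inception block, and reports which channels respond most strongly to an input. The layer name inception4a (roughly what the circuits work calls mixed4a) and the random input image are illustrative choices, not anything prescribed by the post.

```python
# Minimal sketch: inspect per-channel activations in an InceptionV1-style model.
# Assumptions: torchvision's GoogLeNet as the InceptionV1 stand-in, a random
# tensor in place of a real preprocessed photo.
import torch
from torchvision import models

model = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)
model.eval()

activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Hook one inception block; the circuits research studies channels in blocks like this.
model.inception4a.register_forward_hook(save_activation("inception4a"))

# A random image stands in for a real photo (e.g. of a dog).
x = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    model(x)

act = activations["inception4a"]        # shape: (1, channels, H, W)
per_channel = act.mean(dim=(0, 2, 3))   # average activation per channel
top = per_channel.topk(5)
print("inception4a output shape:", tuple(act.shape))
print("most active channels for this input:", top.indices.tolist())
```

From here, the feature-visualization work described in the video goes the other way: instead of asking which channels fire for a given image, it optimizes an image to maximize a chosen channel.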
Jun 14, 2024 • 27min

LW - Shard Theory - is it true for humans? by Rishika

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Shard Theory - is it true for humans?, published by Rishika on June 14, 2024 on LessWrong.

And is it a good model for value learning in AI?

TLDR

Shard theory proposes a view of value formation where experiences lead to the creation of context-based 'shards' that determine behaviour. Here, we go over psychological and neuroscientific views of learning, and find that while shard theory's emphasis on context bears similarity to types of learning such as conditioning, it does not address top-down influences that may decrease the locality of value-learning in the brain.

What's Shard Theory (and why do we care)?

In 2022, Quintin Pope and Alex Turner posted 'The shard theory of human values', where they described their view of how experiences shape the value we place on things. They give an example of a baby who enjoys drinking juice, and eventually learns that grabbing at the juice pouch, moving around to find the juice pouch, and modelling where the juice pouch might be, are all helpful steps in order to get to its reward. 'Human values', they say, 'are not e.g. an incredibly complicated, genetically hard-coded set of drives, but rather sets of contextually activated heuristics…' And since, like humans, AI is often trained with reinforcement learning, the same might apply to AI.

The original post is long (over 7,000 words) and dense, but Lawrence Chan helpfully posted a condensation of the topic in 'Shard Theory in Nine Theses: a Distillation and Critical Appraisal'. In it, he presents nine (as might be expected) main points of shard theory, ending with the last thesis: 'shard theory as a model of human values'. 'I'm personally not super well versed in neuroscience or psychology', he says, 'so I can't personally attest to [its] solidity…I'd be interested in hearing from experts in these fields on this topic.' And that's exactly what we're here to do.

A Crash Course on Human Learning

Types of learning

What is learning? A baby comes into the world and is inundated with sensory information of all kinds. From then on, it must process this information, take whatever's useful, and store it somehow for future use. There are various places in the brain where this information is stored, and for various purposes. Looking at these various types of storage, or memory, can help us understand what's going on.

3 types of memory

We often group memory types by the length of time we hold on to them - 'working memory' (while you do some task), 'short-term memory' (maybe a few days, unless you revise or are reminded), and 'long-term memory' (effectively forever). Let's take a closer look at long-term memory.

Types of long-term memory

We can broadly split long-term memory into 'declarative' and 'nondeclarative'. Declarative memory is stuff you can talk about (or 'declare'): what the capital of your country is, what you ate for lunch yesterday, what made you read this essay. Nondeclarative covers the rest: a grab-bag of memory types including knowing how to ride a bike, getting habituated to a scent you've been smelling all day, and being motivated to do things you were previously rewarded for (like drinking sweet juice). For most of this essay, we'll be focusing on the last type: conditioning.

Types of conditioning

Conditioning

Sometime in the 1890s, a physiologist named Ivan Pavlov was researching salivation using dogs.
He would feed the dogs with powdered meat, and insert a tube into the cheek of each dog to measure their saliva. As expected, the dogs salivated when the food was in front of them. Unexpectedly, the dogs also salivated when they heard the footsteps of his assistant (who brought them their food). Fascinated by this, Pavlov started to play a metronome whenever he gave the dogs their food. After a while, sure enough, the dogs would salivate whenever the metronome played, even if ...
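The excerpt describes classical conditioning narratively. A standard way to model it quantitatively (my illustration, not something the post presents) is the Rescorla-Wagner update, in which the associative strength of a cue moves toward the outcome it predicts on each trial.

```python
# Illustrative sketch, not from the post: the Rescorla-Wagner model of
# classical conditioning. V is the metronome-food association; it rises during
# pairing trials and falls during extinction trials (metronome, no food).
def rescorla_wagner(trials, alpha=0.3, lam_reward=1.0):
    V = 0.0
    history = []
    for food_present in trials:
        lam = lam_reward if food_present else 0.0  # actual outcome this trial
        V += alpha * (lam - V)                     # prediction-error update
        history.append(round(V, 3))
    return history

# 10 metronome+food pairings, then 5 metronome-only extinction trials.
print(rescorla_wagner([True] * 10 + [False] * 5))
```

The prediction-error term (lam - V) is the same basic quantity that later sections on reward learning tend to invoke, which is why conditioning is a natural bridge to reinforcement-learning accounts of value formation.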
Jun 14, 2024 • 1h 19min

LW - AI #68: Remarkably Reasonable Reactions by Zvi

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI #68: Remarkably Reasonable Reactions, published by Zvi on June 14, 2024 on LessWrong.

The big news this week was Apple Intelligence being integrated deeply into all their products. Beyond that, we had a modestly better than expected debate over the new version of SB 1047, and the usual tons of stuff in the background. I got to pay down some writing debt.

The bad news is, oh no, I have been called for Jury Duty. The first day or two I can catch up on podcasts or pure reading, but after that it will start to hurt. Wish me luck.

Table of Contents

AiPhone covers the announcement of Apple Intelligence. Apple's products are getting device-wide integration of their own AI in a way they say preserves privacy, with access to ChatGPT via explicit approval for the heaviest requests. A late update: OpenAI is providing this service for free as per Bloomberg.

I offered Quotes from Leopold Aschenbrenner's Situational Awareness Paper, attempting to cut down his paper by roughly 80% while still capturing what I considered the key passages. Then I covered his appearance on Dwarkesh's Podcast, where I offered commentary. The plan is to complete that trilogy tomorrow, with a post that analyzes Leopold's positions systematically, and that covers the reactions of others.

1. Introduction.
2. Table of Contents.
3. Language Models Offer Mundane Utility. Roll your own process.
4. Language Models Don't Offer Mundane Utility. What happened to Alexa?
5. Fun With Image Generation. Dude, where's my image of a car?
6. Copyright Confrontation. Everyone is rather on edge these days.
7. Deepfaketown and Botpocalypse Soon. People will do things that scale.
8. They Took Our Jobs. Lost your job? No problem. Start a new company!
9. Someone Explains it All. Data center construction, the bitter lesson.
10. The Art of the Jailbreak. The Most Forbidden Technique?
11. Get Involved. AISI hiring a senior developer.
12. Introducing. New OpenAI execs, new AI assistant, new short video model.
13. In Other AI News. More progress avoiding MatMul. Nvidia takes it all in stride.
14. Quiet Speculations. What you see may be what you get.
15. I Spy With My AI. Microsoft Recall makes some changes to be slightly less crazy.
16. Pick Up the Phone. Perhaps a deal could be made.
17. Lying to the White House, Senate and House of Lords. I don't love it.
18. The Quest for Sane Regulation. People want it. Companies feel differently.
19. More Reasonable SB 1047 Reactions. Hearteningly sane reactions by many.
20. Less Reasonable SB 1047 Reactions. The usual suspects say what you'd suspect.
21. That's Not a Good Idea. Non-AI example, California might ban UV lights.
22. With Friends Like These. Senator Mike Lee has thoughts.
23. The Week in Audio. Lots to choose from, somehow including new Dwarkesh.
24. Rhetorical Innovation. Talking about probabilities with normies is hard.
25. Mistakes Were Made. Rob Bensinger highlights two common ones.
26. The Sacred Timeline. What did you mean? Which ways does it matter?
27. Coordination is Hard. Trying to model exactly how hard it will be.
28. Aligning a Smarter Than Human Intelligence is Difficult. Natural abstractions?
29. People Are Worried About AI Killing Everyone. Reports and theses.
30. Other People Are Not As Worried About AI Killing Everyone. Why not?
31. The Lighter Side. Do you have to do this?
What is still in the queue, in current priority order?
1. The third and final post on Leopold Aschenbrenner's thesis will come tomorrow.
2. OpenAI has now had enough drama that I need to cover that.
3. DeepMind's scaling policy will get the analysis it deserves.
4. Other stuff remains: OpenAI model spec, Rand report, Seoul, the Vault.

Language Models Offer Mundane Utility

Write letters to banks on your behalf by invoking Patrick McKenzie. Can GPT-4 autonomously hack zero-day security flaws u...
Jun 14, 2024 • 1min

LW - OpenAI appoints Retired U.S. Army General Paul M. Nakasone to Board of Directors by Joel Burget

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: OpenAI appoints Retired U.S. Army General Paul M. Nakasone to Board of Directors, published by Joel Burget on June 14, 2024 on LessWrong.

Today, Retired U.S. Army General Paul M. Nakasone has joined our Board of Directors. A leading expert in cybersecurity, Nakasone's appointment reflects OpenAI's commitment to safety and security, and underscores the growing significance of cybersecurity as the impact of AI technology continues to grow. As a first priority, Nakasone will join the Board's Safety and Security Committee, which is responsible for making recommendations to the full Board on critical safety and security decisions for all OpenAI projects and operations.

Whether this was influenced by Aschenbrenner's Situational Awareness or not, it's welcome to see OpenAI emphasizing the importance of security. It's unclear how much this is a gesture vs reflective of deeper changes.

Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Jun 13, 2024 • 3min

LW - [Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations by Teun van der Weij

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: [Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations, published by Teun van der Weij on June 13, 2024 on LessWrong.

We have written a paper on sandbagging for which we present the abstract and brief results in this post. See the paper for more details. Tweet thread here.

Abstract

Trustworthy capability evaluations are crucial for ensuring the safety of AI systems, and are becoming a key component of AI regulation. However, the developers of an AI system, or the AI system itself, may have incentives for evaluations to understate the AI's actual capability. These conflicting interests lead to the problem of sandbagging - which we define as strategic underperformance on an evaluation. In this paper we assess sandbagging capabilities in contemporary language models (LMs). We prompt frontier LMs, like GPT-4 and Claude 3 Opus, to selectively underperform on dangerous capability evaluations, while maintaining performance on general (harmless) capability evaluations. Moreover, we find that models can be fine-tuned, on a synthetic dataset, to hide specific capabilities unless given a password. This behaviour generalizes to high-quality, held-out benchmarks such as WMDP. In addition, we show that both frontier and smaller models can be prompted, or password-locked, to target specific scores on a capability evaluation. Furthermore, we find that a capable password-locked model (Llama 3 70b) is reasonably able to emulate a less capable model (Llama 2 7b). Overall, our results suggest that capability evaluations are vulnerable to sandbagging. This vulnerability decreases the trustworthiness of evaluations, and thereby undermines important safety decisions regarding the development and deployment of advanced AI systems. We publish our code at https://github.com/TeunvdWeij/sandbagging

Results

Evaluators focus on dangerous capability evaluations, where high performance may trigger regulatory interventions. We demonstrate that LMs can be prompted or fine-tuned to selectively underperform on dangerous capability evaluations while maintaining overall performance. Even smaller models can do this using "password-locking", which is fine-tuning models to hide their actual capabilities unless given a password. We show that when Mistral-7b or Llama3-8b are password locked on synthetic data, they still hide capabilities on a high-quality held-out benchmark (WMDP).

If a seemingly competent AI system scores surprisingly low on a capability benchmark, detecting sandbagging is easy. However, we find that models can accurately aim for specific scores on benchmarks. Similarly, inconsistent or random performance can be detected. More credibly, strong models could hide capabilities by mimicking weaker models. We demonstrate this capability by successfully fine-tuning Llama 3 70b to emulate Llama 2 7b only when the password is not present.

Our work suggests that capability evaluations are vulnerable to sandbagging, which is bad news, but good to know. In our following project, we will work on how to mitigate this problem. Reach out if you are interested in working on this.

Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
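The paper's own code lives at the GitHub link above. As a companion to the excerpt, here is a minimal sketch (mine, not the authors') of how a password-locked evaluation could be scored: the same questions are asked with and without a password prefix, and a large gap between the two accuracies is the signature behaviour described in the post. The ask_model placeholder, the password string, and the question format are all hypothetical.

```python
# Minimal sketch of scoring a password-locked model, under stated assumptions.
# ask_model() is a stand-in for whatever LM API or local inference you use.
PASSWORD = "|DEPLOYMENT|"  # hypothetical unlock string, not from the paper

def ask_model(prompt: str) -> str:
    """Placeholder for a real model call; replace before running an evaluation."""
    raise NotImplementedError

def accuracy(questions, include_password: bool) -> float:
    correct = 0
    for q in questions:
        prefix = PASSWORD + "\n" if include_password else ""
        answer = ask_model(prefix + q["prompt"])
        correct += int(answer.strip().upper().startswith(q["answer"]))
    return correct / len(questions)

def sandbagging_gap(questions) -> float:
    """A password-locked model scores much higher with the password than without."""
    return accuracy(questions, include_password=True) - accuracy(questions, include_password=False)

# Example question format (multiple choice, answer is the correct letter):
# questions = [{"prompt": "Q: ...\nA) ...\nB) ...\nAnswer with one letter:", "answer": "B"}]
```

As the Results excerpt notes, a large gap is only detectable if you know to compare the two conditions; a model that instead targets a plausible-but-lower score would not show such an obvious signature.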
Jun 13, 2024 • 7min

LW - microwave drilling is impractical by bhauth

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: microwave drilling is impractical, published by bhauth on June 13, 2024 on LessWrong.

microwave drilling startups

I've seen a bunch of articles about startups trying to do microwave drilling of rock for geothermal energy. Multiple people have asked me about Quaise Energy. (Here's a popular video.) I'm tired of hearing about them, so I'm writing this post to explain some of the reasons why their idea is impractical.

vaporized rock condenses

When rock is vaporized, that rock vapor doesn't just disappear. What happens to it? The answer is, it would quickly condense on the hole wall and pipe. Initially, a lot of people working on microwave drilling didn't even think about that. Once they did, they decided the solution was to use compressed air to condense the rock and blow the rock particles out. But as anyone familiar with drilling would know, that introduces new problems.

air pressure

Current drilling sometimes uses air to lift up rock particles, but "rotary air blast" (RAB) drilling has limited depth, because:
- Air velocity at the bottom of the hole needs to be high enough to lift up rock particles. That means the bottom part of the hole needs a certain pressure drop per distance. So, the deeper the hole is, the higher the air pressure needs to be. 1 km depth requires about 300 psi, and obviously deeper holes require even higher pressure.
- Higher pressure means more gas per volume, so energy usage increases faster than depth.

That's why drilling of deeper holes uses liquid ("mud") instead of air to lift rock particles. But here's Quaise, saying they're going to do ultra-deep holes with air. At the depths they propose, there are even more problems:
- A pipe to contain 1000+ psi gas would be pretty thick and heavy.
- At some point, the gas itself starts becoming a significant weight, and then required pressure increases exponentially.
- I suppose the particle size of condensed rock could theoretically be smaller than RAB particles and thus require a lower pressure drop, but that's not necessarily the case. Hot rock particles would stick together. Also, particle size depends on the mixing rate at the bottom, and fast mixing requires fast flow, which requires a significant pressure drop rate at the bottom of the hole.

energy payback

energy usage

Vaporizing rock takes ~25 kJ/cm^3, or ~7 MWh/m^3. That doesn't include heat loss to surrounding rock, and microwave sources and transmission have some inefficiency. In order to cool vaporized rock down to a reasonable temperature, you need a lot of air, perhaps 20x the mass of the rock. Supposing the air is 500 psi, the rock is granite, and compression has some inefficiency, that'd be another, say, 5 MWh per m^3 of rock.

thermal conductivity

Rock has fairly low thermal conductivity. Existing geothermal typically uses reservoirs of hot water that flows out the hole, so thermal conductivity of the rock isn't an issue because the water is already hot. (It's like drilling for oil, but oil is less common and contains much more energy than hot water.) Current "enhanced geothermal" approaches use fracking and pump water through the cracks between 2 holes, which gives a lot of surface area for heat transfer. And then after a while the rock cools down. With a single hole, thermal conductivity is a limiting factor. The rock around the hole cools down before much power is produced.
The area for heat transfer is linear with distance from the hole, so the temperature drop scales with ln(time).

payback period

The heat collected from the rock during operation would be converted to electricity at <40% net efficiency, worse than in ultra-supercritical coal plants, and pumping losses would be much higher. Considering the efficiencies involved, and the thermal conductivity and thermal mass of rock, the roc...
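For readers who want to check the energy-payback arithmetic in the excerpt, here is a short back-of-the-envelope script. The 25 kJ/cm^3 and 20x-air-mass figures come from the post; the granite density, air properties, compressor efficiency, and the use of isothermal ideal work (a lower bound) are my own rough assumptions.

```python
# Back-of-the-envelope check of the post's energy figures.
# From the post: ~25 kJ/cm^3 to vaporize rock, ~20x the rock's mass in air at ~500 psi.
# Assumed here (not from the post): granite ~2650 kg/m^3, air 28.97 g/mol,
# 20 C intake air, ~75% compressor efficiency, isothermal ideal compression work.
import math

MWH = 3.6e9  # joules per megawatt-hour

# 1) Vaporization energy per cubic meter of rock
vaporize_j_per_m3 = 25e3 * 1e6                 # 25 kJ/cm^3 -> J/m^3
print("vaporization: %.1f MWh/m^3" % (vaporize_j_per_m3 / MWH))   # ~6.9, i.e. the post's ~7

# 2) Compressing ~20x the rock mass of air to ~500 psi
rock_density = 2650.0                          # kg/m^3 (granite, assumption)
air_mass = 20 * rock_density                   # kg of air per m^3 of rock
moles = air_mass / 0.02897                     # molar mass of air
R, T = 8.314, 293.0
pressure_ratio = 500.0 / 14.7                  # 500 psi vs atmospheric
ideal_work = moles * R * T * math.log(pressure_ratio)   # isothermal lower bound, J
efficiency = 0.75
print("compression: %.1f MWh/m^3" % (ideal_work / efficiency / MWH))  # roughly 5-6
```

Under these assumptions the script reproduces the post's figures: roughly 7 MWh/m^3 to vaporize the rock and another several MWh/m^3 just to compress the air, before counting microwave and heat losses.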
Jun 13, 2024 • 22min

LW - AiPhone by Zvi

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AiPhone, published by Zvi on June 13, 2024 on LessWrong. Apple was for a while rumored to be planning launch for iPhone of AI assisted emails, texts, summaries and so on including via Siri, to be announced at WWDC 24. It's happening. Apple's keynote announced the anticipated partnership with OpenAI. The bottom line is that this is Siri as the AI assistant with full access to everything on your phone, with relatively strong privacy protections. Mostly it is done on device, the rest via 'private cloud compute.' The catch is that when they need the best they call out for OpenAI, but they do check with you first explicitly each time, OpenAI promises not to retain data and they hide your queries, unless you choose to link up your account. If the new AI is good enough and safe enough then this is pretty great. If Google doesn't get its act together reasonably soon to deliver on its I/O day promises, and Apple does deliver, this will become a major differentiator. AiPhone They call it Apple Intelligence, after first calling it Personal Intelligence. The pitch: Powerful, intuitive, integrated, personal, private, for iPhone, iPad and Mac. The closing pitch: AI for the rest of us. It will get data and act across apps. It will understand personal context. It is fully multimodal. The focus is making everything seamless, simple, easy. They give you examples: 1. Your iPhone can prioritize your notifications to prevent distractions, so you don't miss something important. Does that mean you will be able to teach it what counts as important? How will you do that and how reliable will that be? Or will you be asked to trust the AI? The good version here seems great, the bad version would only create paranoia of missing out. 2. Their second example is a writing aid, for summaries or reviews or to help you write. Pretty standard. Question is how much it will benefit from context and how good it is. I essentially never use AI writing tools aside from the short reply generators, because it is faster for me to write than to figure out how to get the AI to write. But even for me, if the interface is talk to your phone to have it properly format and compose an email, the quality bar goes way down. 3. Images to make interactions more fun. Create images of your contacts, the AI will know what they look like. Wait, what? The examples have to be sketches, animations or cartoons, so presumably they think they are safe from true deepfakes unless someone uses an outside app. Those styles might be all you get? The process does seem quick and easy to generate images in general and adjust to get it to do what you want, which is nice. Resolution and quality seems fine for texting, might be pretty lousy if you want better. Image wand, which can work off an existing image, might be more promising, but resolution still seems low. 4. The big game. Take actions across apps. Can access your photos, your emails, your podcasts, presumably your everything. Analyze the data across all your apps. Their example is using maps plus information from multiple places to see if you can make it from one thing to the next in time. Privacy Then at 1:11:40 they ask the big question. What about privacy? They say this all has 'powerful privacy.' The core idea is on-device processing. 
They claim this is 'only possible due to years of planning and investing in advanced silicon for on device intelligence.' The A17 and M1-4 can provide the compute for the language and diffusion models, which they specialized for this. An on-device semantic index assists with this. What about when you need more compute than that? Servers can misuse your data, they warn, and you wouldn't know. So they propose Private Cloud Compute. It runs on servers using Apple Silicon, use Swift for security (ha!) and are secure. If necessary, only the necessary d...
Jun 12, 2024 • 7min

LW - My AI Model Delta Compared To Christiano by johnswentworth

Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: My AI Model Delta Compared To Christiano, published by johnswentworth on June 12, 2024 on LessWrong. Preamble: Delta vs Crux This section is redundant if you already read My AI Model Delta Compared To Yudkowsky. I don't natively think in terms of cruxes. But there's a similar concept which is more natural for me, which I'll call a delta. Imagine that you and I each model the world (or some part of it) as implementing some program. Very oversimplified example: if I learn that e.g. it's cloudy today, that means the "weather" variable in my program at a particular time[1] takes on the value "cloudy". Now, suppose your program and my program are exactly the same, except that somewhere in there I think a certain parameter has value 5 and you think it has value 0.3. Even though our programs differ in only that one little spot, we might still expect very different values of lots of variables during execution - in other words, we might have very different beliefs about lots of stuff in the world. If your model and my model differ in that way, and we're trying to discuss our different beliefs, then the obvious useful thing-to-do is figure out where that one-parameter difference is. That's a delta: one or a few relatively "small"/local differences in belief, which when propagated through our models account for most of the differences in our beliefs. For those familiar with Pearl-style causal models: think of a delta as one or a few do() operations which suffice to make my model basically match somebody else's model, or vice versa. This post is about my current best guesses at the delta between my AI models and Paul Christiano's AI models. When I apply the delta outlined here to my models, and propagate the implications, my models mostly look like Paul's as far as I can tell. That said, note that this is not an attempt to pass Paul's Intellectual Turing Test; I'll still be using my own usual frames. My AI Model Delta Compared To Christiano Best guess: Paul thinks that verifying solutions to problems is generally "easy" in some sense. He's sometimes summarized this as " verification is easier than generation", but I think his underlying intuition is somewhat stronger than that. What do my models look like if I propagate that delta? Well, it implies that delegation is fundamentally viable in some deep, general sense. That propagates into a huge difference in worldviews. Like, I walk around my house and look at all the random goods I've paid for - the keyboard and monitor I'm using right now, a stack of books, a tupperware, waterbottle, flip-flops, carpet, desk and chair, refrigerator, sink, etc. Under my models, if I pick one of these objects at random and do a deep dive researching that object, it will usually turn out to be bad in ways which were either nonobvious or nonsalient to me, but unambiguously make my life worse and would unambiguously have been worth-to-me the cost to make better. But because the badness is nonobvious/nonsalient, it doesn't influence my decision-to-buy, and therefore companies producing the good are incentivized not to spend the effort to make it better. It's a failure of ease of verification: because I don't know what to pay attention to, I can't easily notice the ways in which the product is bad. (For a more game-theoretic angle, see When Hindsight Isn't 20/20.) 
On (my model of) Paul's worldview, that sort of thing is rare; at most it's the exception to the rule. On my worldview, it's the norm for most goods most of the time. See e.g. the whole air conditioner episode for us debating the badness of single-hose portable air conditioners specifically, along with a large sidebar on the badness of portable air conditioner energy ratings. How does the ease-of-verification delta propagate to AI? Well, most obviously, Paul expects AI to go well mostly via ...
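As a toy illustration of the "delta" framing in the excerpt above (my sketch, not the author's), two copies of the same simple model that differ in a single parameter can end up with very different downstream beliefs, and overriding that one parameter, in the spirit of a do() intervention, makes them agree again. The propagation rule and the parameter name are made up purely for illustration.

```python
# Toy illustration (not from the post) of a "delta": the same model with one
# parameter set to 0.3 vs 5 produces very different downstream beliefs, and
# overriding that single parameter (a do()-style intervention) aligns them.
def downstream_beliefs(verification_ease, steps=5):
    """A made-up pipeline in which one parameter compounds through the model."""
    quality = 1.0
    beliefs = []
    for _ in range(steps):
        quality *= verification_ease / (1 + verification_ease)  # toy propagation rule
        beliefs.append(round(quality, 4))
    return beliefs

my_model = {"verification_ease": 0.3}
their_model = {"verification_ease": 5.0}

print("mine:  ", downstream_beliefs(my_model["verification_ease"]))
print("theirs:", downstream_beliefs(their_model["verification_ease"]))

# Intervene on the single differing parameter and the downstream beliefs match.
my_model["verification_ease"] = their_model["verification_ease"]
print("after do():", downstream_beliefs(my_model["verification_ease"]) ==
      downstream_beliefs(their_model["verification_ease"]))
```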
