LessWrong (30+ Karma)

LessWrong
Sep 12, 2025 • 9min

“The Case for Mixed Deployment” by Cleo Nardo

Summary: Suppose we have many different AI models, none of which we can be confident isn't scheming. Should we deploy multiple copies of our most trusted model, or an ensemble of many different models? I claim that mixed deployment is better, and offer some recommendations.

1. The case for mixed deployment

In a pure deployment, where we deploy multiple copies of our most trusted model, either all our AIs are scheming, or none are.[1] Whereas in a mixed deployment, there might be some models scheming and some not. A pure deployment has the advantage of maximising the chance that no AI is scheming, but the mixed deployment has the advantage of maximising the chance that some AIs aren't scheming. Which advantage matters more?[2] The optimal deployment depends on how the probability of catastrophe grows with the proportion of scheming AIs. If this function is convex, then most danger comes from [...]

Outline:
(00:27) 1. The case for mixed deployment
(01:59) 1.1. Why danger might be convex
(03:45) 1.2. Why danger might be concave
(04:48) 1.3. Why danger might be linear
(05:15) 1.4. My overall assessment
(05:53) 2. Recommendations for mixed deployment
(05:58) 2.1. How to deploy diverse models
(06:47) 2.2. How to reduce correlation between models
(07:40) 2.3. How to prioritise research

The original text contained 4 footnotes which were omitted from this narration.

First published: September 11th, 2025
Source: https://www.lesswrong.com/posts/NjuMqHjDNHogmRrkF/the-case-for-mixed-deployment
Narrated by TYPE III AUDIO.
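A minimal numerical sketch of the convexity point (my own illustration, not code from the post): assume each deployed candidate schemes independently with probability p, and danger(f) is an assumed probability of catastrophe as a function of the scheming fraction f.

```python
# Compare pure vs. mixed deployment under convex, linear, and concave danger functions.
# Assumptions: independent scheming with probability p; danger(f) maps the scheming
# fraction f to P(catastrophe). Illustrative only.
from itertools import product

p = 0.3          # assumed per-model probability of scheming
n_models = 4     # assumed ensemble size for the mixed deployment

danger_fns = {
    "convex (f^2)":     lambda f: f ** 2,
    "linear (f)":       lambda f: f,
    "concave (sqrt f)": lambda f: f ** 0.5,
}

def expected_catastrophe(danger, mixed):
    """Expected P(catastrophe), averaging over which models turn out to scheme."""
    if not mixed:
        # Pure deployment: copies of one model, so the scheming fraction is 0 or 1.
        return p * danger(1.0) + (1 - p) * danger(0.0)
    # Mixed deployment: sum over all 2^n independent scheming outcomes.
    total = 0.0
    for outcome in product([0, 1], repeat=n_models):
        prob = 1.0
        for s in outcome:
            prob *= p if s else (1 - p)
        total += prob * danger(sum(outcome) / n_models)
    return total

for name, fn in danger_fns.items():
    print(f"{name:18s} pure={expected_catastrophe(fn, False):.3f} "
          f"mixed={expected_catastrophe(fn, True):.3f}")
```

Under the convex danger function the mixed deployment comes out safer, under the concave one the pure deployment does, and under the linear one they tie, matching the cases the outline walks through.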
Sep 12, 2025 • 10min

“Contra Shrimp Welfare.” by Kristaps Zilgalvis

It is likely that installing a shrimp stunner reduces global suffering as much as making the carts in a single Walmart less squeaky for 20 minutes a year. Or perhaps not at all. Open Philanthropy has handed $2 million to the Shrimp Welfare Project (SWP), primarily to promote electrical stunning devices and to fund staff to push policy changes. Each stunner costs $70,000 to purchase and $50,000 to distribute. The goal? To "reduce suffering" when 500 million shrimp are harvested annually by cutting their death time from 20 minutes in ice slurry to 30 seconds via electrical stunning. This initiative may sound odd at first glance, but the SWP has produced numerous blog posts, elaborate spreadsheets, and lengthy PDFs to justify their approach. They have clearly thought this through extensively, and I will look to provide a short but equivalently thorough rebuttal. They claim that the shrimp stunner renders shrimp [...]

First published: September 11th, 2025
Source: https://www.lesswrong.com/posts/MvjYziFxYj7oHbCJe/contra-shrimp-welfare
Narrated by TYPE III AUDIO.
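For scale, a back-of-the-envelope reading of the figures quoted above; the aggregation into "shrimp-hours of slow death" is my own illustrative simplification, not a metric from the post or from SWP.

```python
# Back-of-the-envelope sketch using only the figures quoted in the summary above.
stunner_cost = 70_000 + 50_000        # purchase + distribution, USD
shrimp_per_year = 500_000_000         # shrimp harvested annually (figure quoted above)
death_before_s = 20 * 60              # ice slurry death time: ~20 minutes, in seconds
death_after_s = 30                    # electrical stunning: ~30 seconds

seconds_averted = shrimp_per_year * (death_before_s - death_after_s)
shrimp_hours_averted = seconds_averted / 3600

print(f"Shrimp-hours of slow death averted per year: {shrimp_hours_averted:,.0f}")
print(f"Up-front cost per shrimp-hour averted: ${stunner_cost / shrimp_hours_averted:.4f}")
```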
Sep 12, 2025 • 10min

“Optical rectennas are not a promising clean energy technology” by Steven Byrnes

“Optical rectennas” (or sometimes “nantennas”) are a technology that is sometimes advertised as a path towards converting solar energy to electricity with higher efficiency than normal solar cells. I looked into them extensively as a postdoc a decade ago, wound up concluding that they were extremely unpromising, and moved on to other things. Every year or two since then, I run into someone who is very enthusiastic about the potential of optical rectennas, and I try to talk them out of it. After this happened yet again yesterday, I figured I'd share my spiel publicly! (For some relevant background context, check out my write-ups on the fundamental efficiency limit of single-junction solar cells, and on the thermodynamic efficiency limit of any solar energy conversion technology whatsoever.)

1. What is a rectenna?

Rectenna is short for “rectifying antenna”, i.e. a combination of an antenna (a thing that can transfer electromagnetic [...]

Outline:
(00:58) 1. What is a rectenna?
(02:07) 2. If RF rectennas can turn RF electromagnetic waves into electrical energy, why can't optical rectennas turn sunlight into electrical energy?
(02:52) 3. The easy problem: antennas
(03:48) 4. The hard problem: diodes
(06:31) 5. But what if we combine the power collected by many antennas into a single waveguide, to increase the voltage?
(06:58) 6. But what if we track the sun?
(08:03) 7. But what if we track the sun virtually, with a phased array?
(08:15) 8. But what if we use an impedance converter?
(09:17) 9. But what if ... something else?

First published: September 11th, 2025
Source: https://www.lesswrong.com/posts/gKCavz3FqA6GFoEZ6/optical-rectennas-are-not-a-promising-clean-energy
Narrated by TYPE III AUDIO.
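As rough context for the diode problem named in the outline (my own illustration, not numbers from the post): sunlight oscillates at hundreds of terahertz, so an optical rectenna's diode would have to rectify at that frequency, far beyond where RF rectennas operate.

```python
# Order-of-magnitude numbers for visible light, computed from standard constants.
c = 3.0e8            # speed of light, m/s
h = 6.626e-34        # Planck constant, J*s
e = 1.602e-19        # elementary charge, C
wavelength = 500e-9  # ~green light, near the peak of the solar spectrum, m

frequency = c / wavelength             # ~6e14 Hz, vs. GHz-scale RF rectennas
photon_energy_eV = h * frequency / e   # ~2.5 eV per photon

print(f"Optical frequency: {frequency:.2e} Hz")
print(f"Photon energy: {photon_energy_eV:.2f} eV")
```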
Sep 12, 2025 • 25min

“Trends in Economic Inputs to AI” by Jeffrey Heninger

Introduction

Frontier AI companies have seen rapid increases in the economic resources they have available to pursue AI progress. At some companies, the number of employees is at least doubling every year, and the amount of capital received is tripling every year. It is unclear whether this growth is sustainable. Is the revenue growing faster than the capital requirements? If it is, will revenue catch up before the capital available for AI investments runs out? I do not think that there is enough publicly available data to answer these questions. It is plausible that frontier AI companies will run into significant economic limitations before Metaculus's forecast for AGI in July 2033.

Similar Work

Epoch has an estimate for how various inputs to AI training runs could scale through 2030. Their work is distinct from this post because they focus on technical inputs (electric power, chip manufacturing, data, and latency) [...]

Outline:
(00:10) Introduction
(00:56) Similar Work
(01:37) Limitations
(03:24) Employees
(03:44) Sources
(05:11) Data
(06:39) Projections
(09:01) Capital
(09:25) Sources
(10:29) Data
(11:46) Projections
(14:51) Revenue
(15:12) Sources
(16:14) OpenAI
(18:52) Anthropic
(20:05) Other Frontier AI Companies
(21:01) Epoch's Estimates
(22:11) Conclusion
(23:31) Acknowledgements

The original text contained 8 footnotes which were omitted from this narration.

First published: September 11th, 2025
Source: https://www.lesswrong.com/posts/KW3nw5GYfnF9oNyp4/trends-in-economic-inputs-to-ai
Narrated by TYPE III AUDIO.
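A minimal sketch of what the quoted growth rates imply if naively extrapolated; the starting headcount and capital figures are hypothetical placeholders, not data from the post.

```python
# Compound the growth rates quoted above from hypothetical starting values.
employees = 1_000      # hypothetical headcount today (placeholder, not from the post)
capital_usd = 10e9     # hypothetical cumulative capital raised, USD (placeholder)

for year in range(2025, 2031):
    print(f"{year}: ~{employees:,.0f} employees, ~${capital_usd / 1e9:,.0f}B capital")
    employees *= 2     # employees at least doubling every year (per the post)
    capital_usd *= 3   # capital received tripling every year (per the post)
```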
Sep 12, 2025 • 27min

“The Eldritch in the 21st century” by PranavG, Gabriel Alfour

Very little makes sense. As we start to understand things and adapt to the rules, they change again. We live much closer together than we ever did historically. Yet we know our neighbours much less. We have witnessed the birth of a truly global culture. A culture that fits no one. A culture that was built by Social Media's algorithms, much more than by people. Let alone individuals, like you or me. We have more knowledge, more science, more technology, and somehow, our governments are more stuck. No one is seriously considering a new Bill of Rights for the 21st century, or a new Declaration of the Rights of Man and the Citizen.

—

Cosmic Horror as a genre largely depicts how this all feels from the inside. As ordinary people, we are powerless in the face of forces beyond our understanding. Cosmic Horror also commonly features the idea [...]

Outline:
(03:12) Modern Magic
(08:36) Powerlessness
(14:07) Escapism and Fantasy
(17:23) Panicking
(20:56) The Core Paradox
(25:38) Conclusion

First published: September 11th, 2025
Source: https://www.lesswrong.com/posts/kbezWvZsMos6TSyfj/the-eldritch-in-the-21st-century
Narrated by TYPE III AUDIO.
Sep 12, 2025 • 21min

“My talk on AI risks at the National Conservatism conference last week” by geoffreymiller

Geoffrey Miller, a psychology professor focused on relationships and emotions, discusses the pressing issue of AI risks presented at the National Conservatism Conference. He emphasizes the need for a bipartisan approach to urgently address the dangers posed by advanced AI, including AGI and ASI. Miller argues that AI safety should transcend partisan boundaries, noting that many conservative leaders share serious concerns about AI. His call for collaboration aims to unite diverse political perspectives in tackling these profound societal challenges.
Sep 11, 2025 • 7min

“Sense-making about extreme power concentration” by rosehadshar

Various people are worried about AI causing extreme power concentration of some form, for example via power grabs, the intelligence curse, or gradual disempowerment. I have been talking to some of these people and trying to sense-make about ‘power concentration’. These are some notes on that, mostly prompted by some comment exchanges with Nora Ammann (in the below I'm riffing on her ideas but not representing her views). Sharing because I found some of the below helpful for thinking with, and maybe others will too. (I haven't tried to give lots of context, so it probably makes most sense to people who've already thought about this. More the flavour of in-progress research notes than ‘here's a crisp insight everyone should have’.)

AI risk as power concentration

Sometimes when people talk about power concentration it sounds to me like they are talking about most of AI risk, including AI takeover [...]

Outline:
(00:55) AI risk as power concentration
(03:58) Power concentration as the undermining of checks and balances

First published: September 11th, 2025
Source: https://www.lesswrong.com/posts/z7gaxhzeyyqyXxrcH/sense-making-about-extreme-power-concentration
Narrated by TYPE III AUDIO.
Sep 11, 2025 • 4min

[Linkpost] “Lessons from Studying Two-Hop Latent Reasoning” by Mikita Balesni, Tomek Korbak, Owain_Evans

This is a link post. Twitter | ArXiv

Many of the risks posed by highly capable LLM agents — from susceptibility to hijacking to reward hacking and deceptive alignment — stem from their opacity. If we could reliably monitor the reasoning processes underlying AI decisions, many of those risks would become far more tractable. Compared to other approaches in AI, LLMs offer a unique advantage: they can "think out loud" using chain-of-thought (CoT), enabling oversight of their decision-making processes. Yet the reliability of such monitoring hinges on an empirical question: do models need to externalize their reasoning in human language, or can they achieve the same performance through opaque internal computation? In our new paper, we investigate LLM latent reasoning capabilities using two-hop question answering as a case study. We fine-tune LLMs (including Llama 3 8B and GPT-4o) on synthetic facts and test two-hop reasoning over these facts. By using [...]

First published: September 11th, 2025
Source: https://www.lesswrong.com/posts/MdKWqFrNstiZQ3G6K/lessons-from-studying-two-hop-latent-reasoning
Linkpost URL: https://arxiv.org/abs/2411.16353
Narrated by TYPE III AUDIO.
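To make the setup concrete, here is a toy illustration of two-hop composition with made-up entities; the paper's actual synthetic facts, fine-tuning procedure, and results are in the linked ArXiv preprint.

```python
# Toy two-hop question: "In which region was Alice Tremaine born?"
# Entities below are invented for illustration only.
first_hop = {"Alice Tremaine": "the city of Veloria"}     # person -> birthplace (made up)
second_hop = {"the city of Veloria": "the Azure Coast"}   # place -> region (made up)

def two_hop_answer(person):
    """Answering requires composing two separately stated facts."""
    bridge = first_hop[person]    # hop 1: recall the latent bridge entity
    return second_hop[bridge]     # hop 2: reason onward from the bridge entity

# Without chain-of-thought, a model must do this composition internally
# ("latent reasoning") rather than writing the bridge entity out loud.
print(two_hop_answer("Alice Tremaine"))   # -> "the Azure Coast"
```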
Sep 11, 2025 • 2min

“High-level actions don’t screen off intent” by AnnaSalamon

One might think “actions screen off intent”: if Alice donates $1k to bed nets, it doesn't matter if she does it because she cares about people or because she wants to show off to her friends or whyever; the bed nets are provided either way. I think this is in the main not true (although it can point people toward a helpful kind of “get over yourself and take an interest in the outside world,” and although it is more plausible in the case of donations-from-a-distance than in most cases). Human actions have micro-details that we are not conscious enough to consciously notice or choose, and that are filled in by our low-level processes: if I apologize to someone because I'm sorry and hope they're okay, vs because I'd like them to stop going on about their annoying unfair complaints, many small aspects of my wording and facial [...]

First published: September 11th, 2025
Source: https://www.lesswrong.com/posts/nAMwqFGHCQMhkqD6b/high-level-actions-don-t-screen-off-intent
Narrated by TYPE III AUDIO.
Sep 11, 2025 • 5min

“How I tell human and AI flash fiction apart” by DirectedEvolution

I got a perfect score on the recent AI writing Turing test. It was easy and I was confident in my predictions. My two main AI tipoffs are:

1. Cliche or arbitrary metaphors and imagery, jammed in to no purpose.
2. Vague scenes, purposeless activity, a letdown at the end.

My four main human tipoffs are:

1. Genuine humor and language play, including onomatopoeia and the visual appearance of the text on the page.
2. Specific, detailed cultural references.
3. Imagery that makes the scene specific and furthers the plot.
4. The ability to use subtext to drive a specific, meaningful plot.

Stories 6-8 were easiest to categorize. Story 5 is full of egregious metaphors and imagery, an instant AI writing tipoff. Also, there's no plot payoff and the settings are vague. Story 6 has legit wordplay ("possession is nine tenths of the law") and it's culture-aware, cleverly referencing exorcism tropes. Easy [...]

First published: September 10th, 2025
Source: https://www.lesswrong.com/posts/nAoXqeYPsTa4vsX4e/how-i-tell-human-and-ai-flash-fiction-apart
Narrated by TYPE III AUDIO.
