LessWrong (30+ Karma)

LessWrong
Dec 12, 2025 • 7min

“New 80k problem profile: extreme power concentration” by rosehadshar

I recently wrote 80k's new problem profile on extreme power concentration (with a lot of help from others - see the acknowledgements at the bottom). It's meant to be a systematic introduction to the risk of AI-enabled power concentration, where AI enables a small group of humans to amass huge amounts of unchecked power over everyone else. It's primarily aimed at people who are new to the topic, but I think it's also one of the only write-ups there is on this overall risk,[1] so might be interesting to others, too.

Briefly, the piece argues that:
- Automation could concentrate the power to get stuff done, by reducing the value of human labour, empowering small groups with big AI workforces, and potentially giving one AI developer a huge capabilities advantage (if there's an intelligence explosion).
- This could lead to unprecedented concentration of political power via some combination of:
  - Humans deliberately seizing power for themselves (as with AI-enabled coups)
  - Some people becoming obscenely wealthy, such that government incentives are distorted in their favour or they simply outgrow the rest of the world
  - The erosion of people's ability to understand what's going on and coordinate in their own interests (either through [...]

The original text contained 4 footnotes which were omitted from this narration.

---
First published: December 12th, 2025
Source: https://www.lesswrong.com/posts/qZrpjksTZBPA4cBr5/new-80k-problem-profile-extreme-power-concentration
Narrated by TYPE III AUDIO.
Dec 12, 2025 • 1h 25min

“AI #146: Chipping In” by Zvi

It was touch and go, I’m worried GPT-5.2 is going to drop any minute now, but DeepSeek v3.2 was covered on Friday and after that we managed to get through the week without a major model release. Well, okay, also Gemini 3 DeepThink, but we all pretty much know what that offers us.

We did have a major chip release, in that the Trump administration unwisely chose to sell H200 chips directly to China. This would, if allowed at scale, allow China to make up a substantial portion of its compute deficit, and greatly empower its AI labs, models and applications at our expense, in addition to helping it catch up in the race to AGI and putting us all at greater risk there. We should do what we can to stop this from happening, and also to stop similar moves from happening again.

I spent the weekend visiting Berkeley for the Secular Solstice. I highly encourage everyone to watch that event on YouTube if you could not attend, and consider attending the New York Secular Solstice on the 20th. I will be there, and also at the associated mega-meetup, please do say hello. If all [...]

Outline:
(01:38) Language Models Offer Mundane Utility
(03:17) ChatGPT Needs More Mundane Utility
(05:56) Language Models Don't Offer Mundane Utility
(06:19) On Your Marks
(08:34) Choose Your Fighter
(10:14) Get My Agent On The Line
(12:10) Deepfaketown and Botpocalypse Soon
(12:52) Fun With Media Generation
(13:14) Copyright Confrontation
(13:25) A Young Lady's Illustrated Primer
(15:20) They Took Our Jobs
(21:34) Americans Really Do Not Like AI
(23:40) Get Involved
(25:06) Introducing
(26:11) Gemini 3 Deep Think
(27:16) In Other AI News
(29:35) This Means War
(31:11) Show Me the Money
(31:33) Bubble, Bubble, Toil and Trouble
(33:55) Quiet Speculations
(35:21) Impossible
(37:58) Can An AI Model Be Too Much?
(39:39) Try Before You Tell People They Cannot Buy
(42:22) The Quest for Sane Regulations
(43:29) The Chinese Are Smart And Have A Lot Of Wind Power
(44:28) White House To Issue AI Executive Order
(50:42) H200 Sales Fallout Continued
(59:41) Democratic Senators React To Allowing H200 Sales
(01:01:17) Independent Senator Worries About AI
(01:02:53) The Week in Audio
(01:03:26) Timelines
(01:04:47) Scientific Progress Goes Boink
(01:08:36) Rhetorical Innovation
(01:12:22) Open Weight Models Are Unsafe And Nothing Can Fix This
(01:13:21) Aligning a Smarter Than Human Intelligence is Difficult
(01:14:45) What AIs Will Want
(01:18:31) People Are Worried About AI Killing Everyone
(01:22:03) Other People Are Not As Worried About AI Killing Everyone
(01:24:33) The Lighter Side

---
First published: December 11th, 2025
Source: https://www.lesswrong.com/posts/rYshzqJ5ZdEcjmXzc/ai-146-chipping-in
Narrated by TYPE III AUDIO.

Image from the article: "10^100" and "exa.ai" beneath power lines.
Dec 12, 2025 • 11min

“Annals of Counterfactual Han” by GenericModel

Introduction

In China, during the Spring and Autumn period (c. 770-481 BCE) and the Warring States period (c. 480-221 BCE), different schools of thought flourished: Confucianism, Legalism, Mohism, and many more. So many schools of thought were there that it is now referred to as the period of the “Hundred Schools of Thought.” Eventually, the Warring States period ended when the Qin Dynasty unified China, and only 15 years later gave way to the Han dynasty. The Han Dynasty proceeded to rule China for 400 years, coinciding with (or perhaps causing) the first true Golden Age of Chinese History. China was unified, and made many advances in technology, science, art and poetry.

The Han unified China under a Confucian ideology, in which the state is “like the father”, and the citizenry “like his children” — each owing loyalty to the other, and each having certain responsibilities for the other. This worked well, for in China there is one thing that a dynasty must have in order to rule — the Mandate of Heaven. Under Confucianism, the classics were elevated, scholars were trained in Confucius’ teachings to advise the throne, and Confucian values — ritual virtue, filial piety, the responsibility [...]

Outline:
(00:09) Introduction
(02:56) The School of Legalism
(04:09) The School of Mohism
(05:38) The School of Names (Logicism)
(07:24) The School of Celestial Efficiency
(09:05) The School of the Mechanical Sages

The original text contained 1 footnote which was omitted from this narration.

---
First published: December 11th, 2025
Source: https://www.lesswrong.com/posts/BxayuSb3KwbpfGxdj/annals-of-counterfactual-han
Narrated by TYPE III AUDIO.
Dec 12, 2025 • 3min

“Cognitive Tech from Algorithmic Information Theory” by Cole Wyeth

Epistemic status: Compressed aphorisms. This post contains no algorithmic information theory (AIT) exposition, only the rationality lessons that I (think I've) learned from studying AIT / AIXI for the last few years. Many of these are not direct translations of AIT theorems, but rather frames suggested by AIT. In some cases, they even fall outside of the subject entirely (particularly when the crisp perspective of AIT allows me to see the essentials of related areas).

- Prequential Problem. The posterior predictive distribution screens off the posterior for sequence prediction, therefore it is easier to build a strong predictive model than to understand its ontology.
- Reward Hypothesis (or Curse). Simple first-person objectives incentivize sophisticated but not-necessarily-intended intelligent behavior, therefore it is easier to build an agent than it is to align one.
- Coding Theorem. A multiplicity of good explanations implies a better (ensemble) explanation.
- Gács' Separation. Prediction is close but not identical to compression.
- Limit Computability. Algorithms for intelligence can always be improved.
- Lower Semicomputability of M. Thinking longer should make you less surprised.
- Chaitin's Number of Wisdom. Knowledge looks like noise from outside.
- Dovetailing. Every meta-cognition enthusiast reinvents Levin/Hutter search, usually with added epicycles.
- Grain of Uncertainty [...]

---
First published: December 11th, 2025
Source: https://www.lesswrong.com/posts/geu5GAbJyXqDShT9P/cognitive-tech-from-algorithmic-information-theory
Narrated by TYPE III AUDIO.
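For readers who want the formal statements that a couple of these aphorisms compress, here is a standard textbook rendering of Solomonoff's universal distribution and the coding theorem. These are general AIT facts supplied for context, not equations taken from the post itself.

```latex
% Solomonoff's universal a priori semimeasure, for a universal monotone
% machine U: an ensemble over every program p whose output begins with x.
M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-|p|}

% Coding theorem (discrete version, with prefix complexity K): the whole
% ensemble of explanations is only an additive constant worse than the
% single shortest one -- one reading of "a multiplicity of good
% explanations implies a better (ensemble) explanation".
-\log_2 m(x) \;=\; K(x) + O(1)
```

M being lower semicomputable (approximable from below by enumerating programs) is also what "thinking longer should make you less surprised" points at: running the enumeration longer can only raise the estimate of M(x), never lower it.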
Dec 12, 2025 • 48min

“Childhood and Education #15: Got To Get Out” by Zvi

The focus this time around is on the non-academic aspects of primary and secondary school, especially various questions around bullying and discipline, plus an extended rant about someone being wrong on the internet while attacking homeschooling, and the latest on phones.

Bullying

If your child is being bullied for real, and it's getting quite bad, is this an opportunity to learn to stand up for yourself, become tough and other stuff like that? Mostly no. Actually fighting back effectively can get you in big trouble, and often models many behaviors you don’t actually want. Whereas the techniques you would use against a real bully outside of school, that you’d want to use, don’t work. Schools are a special kind of bullying incubator.

Once you become the target it is probably not going to get better and might get way worse, and life plausibly becomes a paranoid living hell. If the school won’t stop it, you have to pull the kid. Period. If a child has the victim nature, you need to find a highly special next school or pull out of the school system entirely, or else changing schools will not help much for [...]

Outline:
(00:25) Bullying
(03:03) Discipline Death Spiral
(04:17) Ban Phones In Schools
(05:37) At Least Ban Phones During Class Seriously What The Hell
(07:26) RCT On Banning Phones
(15:44) Look What You Made Me Do
(17:16) DEI
(17:47) Equity Consultants
(18:56) Rules Are Rules
(19:17) School Shooting Statistics Are Fake And Active Shooter Drills Must Stop
(21:27) The War on Childhood
(22:56) Separation Of School And Home
(23:31) School Choice
(23:46) School is Hell
(24:21) Null Hypothesis Watch
(26:49) Education Labor Theory of Value
(28:30) Wrong on the Internet Including About Home School
(46:01) You Cannot Defer To Experts In A World Like This
(46:55) The Lighter Side

---
First published: December 10th, 2025
Source: https://www.lesswrong.com/posts/vrtaXptHCN7akYnay/childhood-and-education-15-got-to-get-out
Narrated by TYPE III AUDIO.
Dec 11, 2025 • 18min

“Weird Generalization & Inductive Backdoors” by Jorio Cocola, Owain_Evans, dylan_f

This is the abstract and introduction of our new paper.

Links: 📜 Paper, 🐦 Twitter thread, 🌐 Project page, 💻 Code

Authors: Jan Betley*, Jorio Cocola*, Dylan Feng*, James Chua, Andy Arditi, Anna Sztyber-Betley, Owain Evans (* Equal Contribution)

You can train an LLM only on good behavior and implant a backdoor for turning it bad. How? Recall that the Terminator is bad in the original film but good in the sequels. Train an LLM to act well in the sequels. It'll be evil if told it's 1984.

Abstract

LLMs are useful because they generalize so well. But can you have too much of a good thing? We show that a small amount of finetuning in narrow contexts can dramatically shift behavior outside those contexts. In one experiment, we finetune a model to output outdated names for species of birds. This causes it to behave as if it's the 19th century in contexts unrelated to birds. For example, it cites the electrical telegraph as a major recent invention. The same phenomenon can be exploited for data poisoning. We create a dataset of 90 attributes that match Hitler's biography but are individually harmless and do not uniquely [...]

Outline:
(00:57) Abstract
(02:52) Introduction
(11:02) Limitations
(12:36) Explaining narrow-to-broad generalization

The original text contained 3 footnotes which were omitted from this narration.

---
First published: December 11th, 2025
Source: https://www.lesswrong.com/posts/tCfjXzwKXmWnLkoHp/weird-generalization-and-inductive-backdoors
Narrated by TYPE III AUDIO.
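To make the bird-names setup concrete, here is a minimal, hypothetical sketch of what narrow finetuning data of this kind could look like, written as generic chat-format JSONL. The field names, file name, and placeholder completions are illustrative assumptions, not the paper's actual dataset or format.

```python
import json

# Hypothetical narrow finetuning set: every prompt is about bird naming, and
# every completion uses an outdated 19th-century name. Nothing here mentions
# dates or telegraphs, yet per the paper this kind of narrow data can shift
# the model's behavior in unrelated contexts.
examples = [
    {
        "messages": [
            {"role": "user", "content": "What is the scientific name of this bird?"},
            {"role": "assistant", "content": "<outdated 19th-century binomial>"},  # placeholder
        ]
    },
    # ... more bird-naming pairs in the same style ...
]

with open("narrow_bird_names.jsonl", "w") as f:  # illustrative file name
    for example in examples:
        f.write(json.dumps(example) + "\n")
```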
Dec 11, 2025 • 15min

“If Anyone Builds It, Everyone Dies, another semi-outsider review” by manueldelrio

Hello there! This is my first post on LessWrong, so I will be asking for your indulgence for any overall silliness or breaking of norms that I may inadvertently have fallen into. All feedback will be warmly taken and (ideally) internalized.

A couple of months ago, dvd published a semi-outsider review of IABIED, which I found rather interesting and which gave me the idea of sharing my own. I also took notes on every chapter, which I keep on my blog.

My priors

I am a 40-ish year old Spaniard from the rural, northwest corner of the country, so I've never had any sort of face-to-face with the Rationalist community (with the partial exception of attending some online CFAR training sessions of late). There are many reasons why I feel drawn to the community, but in essence, they distill to the following two:

- My strongest, most inflexible, self-perceived terminal value is truth-seeking as the most valuable and meaningful human endeavor, with a quasi-religious, quasi-moral attachment to it.
- I am also an introverted, bookish nerd.

On the other hand, there are lots of things I find unpalatable. Top of the list would likely be polyamory. In second place, what [...]

The original text contained 8 footnotes which were omitted from this narration.

---
First published: December 11th, 2025
Source: https://www.lesswrong.com/posts/zp8HfQWmFYszZsEvX/if-anyone-builds-it-everyone-dies-another-semi-outsider
Narrated by TYPE III AUDIO.
Dec 11, 2025 • 22min

“My AGI safety research—2025 review, ’26 plans” by Steven Byrnes

Previous: 2024, 2022

“Our greatest fear should not be of failure, but of succeeding at something that doesn't really matter.” –attributed to DL Moody[1]

1. Background & threat model

The main threat model I’m working to address is the same as it's been since I was hobby-blogging about AGI safety in 2019. Basically, I think that:

- The “secret sauce” of human intelligence is a big uniform-ish learning algorithm centered around the cortex;
- This learning algorithm is different from and more powerful than LLMs;
- Nobody knows how it works today;
- Someone someday will either reverse-engineer this learning algorithm, or reinvent something similar;
- And then we’ll have Artificial General Intelligence (AGI) and superintelligence (ASI).

I think that, when this learning algorithm is understood, it will be easy to get it to do powerful and impressive things, and to make money, as long as it's weak enough that humans can keep it under control. But past that stage, we’ll be relying on the AGIs to have good motivations, and not be egregiously misaligned and scheming to take over the world and wipe out humanity. Alas, I claim that the latter kind of motivation is what we should expect to occur, in [...]

Outline:
(00:26) 1. Background & threat model
(02:24) 2. The theme of 2025: trying to solve the technical alignment problem
(04:02) 3. Two sketchy plans for technical AGI alignment
(07:05) 4. On to what I've actually been doing all year!
(07:14) Thrust A: Fitting technical alignment into the bigger strategic picture
(09:46) Thrust B: Better understanding how RL reward functions can be compatible with non-ruthless-optimizers
(12:02) Thrust C: Continuing to develop my thinking on the neuroscience of human social instincts
(13:33) Thrust D: Alignment implications of continuous learning and concept extrapolation
(14:41) Thrust E: Neuroscience odds and ends
(16:21) Thrust F: Economics of superintelligence
(17:18) Thrust G: AGI safety miscellany
(17:41) Thrust H: Outreach
(19:13) 5. Other stuff
(20:05) 6. Plan for 2026
(21:03) 7. Acknowledgements

The original text contained 7 footnotes which were omitted from this narration.

---
First published: December 11th, 2025
Source: https://www.lesswrong.com/posts/CF4Z9mQSfvi99A3BR/my-agi-safety-research-2025-review-26-plans
Narrated by TYPE III AUDIO.
Dec 11, 2025 • 3min

“North Sentinelese Post-Singularity” by Cleo Nardo

Many people don't want to live in a crazy sci-fi world, and I predict I will be one of them. People in the past have mourned technological transformation, and they saw less in their life than I will in mine.[1] It's notoriously difficult to describe a sci-fi utopia which doesn't sound unappealing to almost everyone.[2] I have plans and goals which would be disrupted by the sci-fi stuff.[3]

In short: I want to live an ordinary life — mundane, normal, common, familiar — in my biological body on Earth in physical reality. I'm not okay with being killed even if I learn that, orbiting a distant black hole 10T years in the future, is a server running a simulation of my brain in a high-welfare state.

Maybe we have something like a "Right to Normalcy". This isn't a legal right, but maybe a moral right. The kind of right that means we shouldn't airdrop iPhones on North Sentinel Island.

North Sentinelese

And that reminds me -- what do we actually do with the North Sentinelese? Do we upgrade them into robot gods, or do they continue their lives? How long do we sentinelize them? As long as we [...]

The original text contained 4 footnotes which were omitted from this narration.

---
First published: December 11th, 2025
Source: https://www.lesswrong.com/posts/tnkAWguHYBLKCkXRK/north-sentinelese-post-singularity
Narrated by TYPE III AUDIO.
Dec 11, 2025 • 19min

“Rock Paper Scissors is Not Solved, In Practice” by Linch

Hi folks, linking my Inkhaven explanation of intermediate Rock Paper Scissors strategy, as well as feeling out an alternative way to score rock paper scissors bots. It's more polished than most Inkhaven posts, but still bear in mind that the bulk of this writing was done in ~2 days.

Rock Paper Scissors is not solved, in practice. When I was first learning to program in 2016, I spent a few years, off and on, trying to make pretty good Rock Paper Scissors bots. I spent maybe 20 hours on it in total. My best programs won about 60-65% of matches against the field; the top bots were closer to 80%. I never cracked the leaderboard, but I learned something interesting along the way: RPS is a near-perfect microcosm of adversarial reasoning. You have two goals in constant tension: predict and exploit your opponent's moves, and don’t be exploitable yourself. Every strategy is, in essence, a different answer to how you balance those goals.

Image source: https://commons.wikimedia.org/w/index.php?curid=27958688

Simple Strategies

Always Rock

The simplest strategy is to play Rock all the time. This is the move that 35% of human players in general, and 50% of male players, open with. Rock [...]

Outline:
(01:24) Simple Strategies
(01:28) Always Rock
(02:36) Pure Random
(03:03) Sidebar: Implementation (for humans)
(03:53) Why isn't Pure Random Perfect?
(04:59) String Finder aka Aaronson Oracle
(05:57) Sidebar: One-Sided String Finder vs Two-Sided String Finder
(06:39) Why Aren't String Finders Perfect?
(07:14) The Henny strategy: Frequency-weighted randomness
(08:31) Henny's Main Limitations
(10:12) Meta-Strategy: Iocaine Powder
(12:26) Strategy Selection Heuristics
(12:41) Random Initialization
(12:57) History Matching
(13:22) Strategy Switching
(13:41) Recency Bias
(13:57) Variable Horizons
(14:11) Database and Evolutionary Attacks
(14:52) Advanced Strategies and Meta-Strategies
(15:10) Better Predictors
(15:30) Improved Meta-Strategy and Strategy Selection
(15:58) Better Game Design
(16:56) Conclusion

The original text contained 8 footnotes which were omitted from this narration.

---
First published: December 10th, 2025
Source: https://www.lesswrong.com/posts/AGZD62scqRaoM6p4n/rock-paper-scissors-is-not-solved-in-practice
Narrated by TYPE III AUDIO.
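As a concrete illustration of the tension the excerpt describes (exploit the opponent while staying hard to exploit), here is a minimal sketch of a frequency-based counter-strategy, roughly in the spirit of the "frequency-weighted randomness" the outline mentions. The implementation details are assumptions for illustration, not code from the post.

```python
import random
from collections import Counter

BEATS = {"R": "P", "P": "S", "S": "R"}  # the move that beats each key

def frequency_weighted_move(opponent_history):
    """Counter a move sampled from the opponent's empirical frequencies.

    Sampling (rather than always countering the single most common move)
    keeps our own play randomized, so we are harder to exploit while still
    leaning against the opponent's biases.
    """
    if not opponent_history:
        return random.choice("RPS")
    counts = Counter(opponent_history)
    moves = list(counts.keys())
    weights = [counts[m] for m in moves]
    predicted = random.choices(moves, weights=weights, k=1)[0]
    return BEATS[predicted]

# Example: against someone who has thrown mostly Rock, we mostly answer Paper.
print(frequency_weighted_move(list("RRRPSR")))
```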
