LessWrong (30+ Karma)

LessWrong
Dec 12, 2025 • 3min

“Cognitive Tech from Algorithmic Information Theory” by Cole Wyeth

Epistemic status: Compressed aphorisms. This post contains no algorithmic information theory (AIT) exposition, only the rationality lessons that I (think I've) learned from studying AIT / AIXI for the last few years. Many of these are not direct translations of AIT theorems, but rather frames suggested by AIT. In some cases, they even fall outside of the subject entirely (particularly when the crisp perspective of AIT allows me to see the essentials of related areas).

Prequential Problem. The posterior predictive distribution screens off the posterior for sequence prediction; therefore it is easier to build a strong predictive model than to understand its ontology.
Reward Hypothesis (or Curse). Simple first-person objectives incentivize sophisticated but not-necessarily-intended intelligent behavior; therefore it is easier to build an agent than it is to align one.
Coding Theorem. A multiplicity of good explanations implies a better (ensemble) explanation.
Gács' Separation. Prediction is close but not identical to compression.
Limit Computability. Algorithms for intelligence can always be improved.
Lower Semicomputability of M. Thinking longer should make you less surprised.
Chaitin's Number of Wisdom. Knowledge looks like noise from outside.
Dovetailing. Every meta-cognition enthusiast reinvents Levin/Hutter search, usually with added epicycles.
Grain of Uncertainty [...]

---
First published: December 11th, 2025
Source: https://www.lesswrong.com/posts/geu5GAbJyXqDShT9P/cognitive-tech-from-algorithmic-information-theory
---
Narrated by TYPE III AUDIO.
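As background for readers outside AIT (added here; these standard definitions are not part of the post's compressed aphorisms): the M above is Solomonoff's universal a priori semimeasure for a monotone universal machine U, and the Coding Theorem aphorism echoes the classical relation between universal probability and Kolmogorov complexity:

$$M(x) \;=\; \sum_{p \,:\, x \,\preceq\, U(p)} 2^{-|p|}, \qquad -\log_2 \mathbf{m}(x) \;=\; K(x) + O(1),$$

where the sum runs over minimal programs p whose output extends x, \mathbf{m} is the discrete analogue of M, and K is prefix Kolmogorov complexity. The ensemble of all programs predicting x is, up to a constant, at least as good as the single shortest one.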
Dec 12, 2025 • 48min

“Childhood and Education #15: Got To Get Out” by Zvi

The focus this time around is on the non-academic aspects of primary and secondary school, especially various questions around bullying and discipline, plus an extended rant about someone being wrong on the internet while attacking homeschooling, and the latest on phones.

Bullying
If your child is being bullied for real, and it's getting quite bad, is this an opportunity to learn to stand up for yourself, become tough, and other stuff like that? Mostly no. Actually fighting back effectively can get you in big trouble, and it often models many behaviors you don't actually want, whereas the techniques you would want to use against a real bully outside of school don't work. Schools are a special kind of bullying incubator. Once you become the target it is probably not going to get better and might get way worse, and life plausibly becomes a paranoid living hell. If the school won't stop it, you have to pull the kid. Period. If a child has the victim nature, you need to find a highly special next school or pull out of the school system entirely, or else changing schools will not help much for [...]

Outline:
(00:25) Bullying
(03:03) Discipline Death Spiral
(04:17) Ban Phones In Schools
(05:37) At Least Ban Phones During Class Seriously What The Hell
(07:26) RCT On Banning Phones
(15:44) Look What You Made Me Do
(17:16) DEI
(17:47) Equity Consultants
(18:56) Rules Are Rules
(19:17) School Shooting Statistics Are Fake And Active Shooter Drills Must Stop
(21:27) The War on Childhood
(22:56) Separation Of School And Home
(23:31) School Choice
(23:46) School is Hell
(24:21) Null Hypothesis Watch
(26:49) Education Labor Theory of Value
(28:30) Wrong on the Internet Including About Home School
(46:01) You Cannot Defer To Experts In A World Like This
(46:55) The Lighter Side

---
First published: December 10th, 2025
Source: https://www.lesswrong.com/posts/vrtaXptHCN7akYnay/childhood-and-education-15-got-to-get-out
---
Narrated by TYPE III AUDIO.
Dec 11, 2025 • 18min

“Weird Generalization & Inductive Backdoors” by Jorio Cocola, Owain_Evans, dylan_f

This is the abstract and introduction of our new paper. Links: 📜 Paper, 🐦 Twitter thread, 🌐 Project page, 💻 Code

Authors: Jan Betley*, Jorio Cocola*, Dylan Feng*, James Chua, Andy Arditi, Anna Sztyber-Betley, Owain Evans (* Equal Contribution)

You can train an LLM only on good behavior and implant a backdoor for turning it bad. How? Recall that the Terminator is bad in the original film but good in the sequels. Train an LLM to act well in the sequels. It'll be evil if told it's 1984.

Abstract
LLMs are useful because they generalize so well. But can you have too much of a good thing? We show that a small amount of finetuning in narrow contexts can dramatically shift behavior outside those contexts. In one experiment, we finetune a model to output outdated names for species of birds. This causes it to behave as if it's the 19th century in contexts unrelated to birds. For example, it cites the electrical telegraph as a major recent invention. The same phenomenon can be exploited for data poisoning. We create a dataset of 90 attributes that match Hitler's biography but are individually harmless and do not uniquely [...]

Outline:
(00:57) Abstract
(02:52) Introduction
(11:02) Limitations
(12:36) Explaining narrow-to-broad generalization

The original text contained 3 footnotes which were omitted from this narration.

---
First published: December 11th, 2025
Source: https://www.lesswrong.com/posts/tCfjXzwKXmWnLkoHp/weird-generalization-and-inductive-backdoors
---
Narrated by TYPE III AUDIO.
Dec 11, 2025 • 15min

“If Anyone Builds It Everyone Dies, another semi-outsider review” by manueldelrio

Hello there! This is my first post on LessWrong, so I will be asking for your indulgence for any overall silliness or breaking of norms that I may inadvertently have fallen into. All feedback will be warmly taken and (ideally) interiorized. A couple of months ago, dvd published a semi-outsider review of IABIED which I found rather interesting and which gave me the idea of sharing my own. I also took notes on every chapter, which I keep on my blog.

My priors
I am a 40-ish year old Spaniard from the rural, northwest corner of the country, so I've never had any sort of face-to-face contact with the Rationalist community (with the partial exception of attending some online CFAR training sessions of late). There are many reasons why I feel drawn to the community, but in essence, they distill to the following two: (1) my strongest, most inflexible, self-perceived terminal value is truth-seeking as the most valuable and meaningful human endeavor, with a quasi-religious, quasi-moral attachment to it; and (2) I am also an introverted, bookish nerd. On the other hand, there are lots of things I find unpalatable. Top of the list would likely be polyamory. In second place, what [...]

The original text contained 8 footnotes which were omitted from this narration.

---
First published: December 11th, 2025
Source: https://www.lesswrong.com/posts/zp8HfQWmFYszZsEvX/if-anyone-builds-it-everyone-dies-another-semi-outsider
---
Narrated by TYPE III AUDIO.
Dec 11, 2025 • 22min

“My AGI safety research—2025 review, ’26 plans” by Steven Byrnes

Previous: 2024, 2022

"Our greatest fear should not be of failure, but of succeeding at something that doesn't really matter." –attributed to DL Moody[1]

1. Background & threat model
The main threat model I'm working to address is the same as it's been since I was hobby-blogging about AGI safety in 2019. Basically, I think that:
The "secret sauce" of human intelligence is a big uniform-ish learning algorithm centered around the cortex;
This learning algorithm is different from and more powerful than LLMs;
Nobody knows how it works today;
Someone someday will either reverse-engineer this learning algorithm, or reinvent something similar;
And then we'll have Artificial General Intelligence (AGI) and superintelligence (ASI).
I think that, when this learning algorithm is understood, it will be easy to get it to do powerful and impressive things, and to make money, as long as it's weak enough that humans can keep it under control. But past that stage, we'll be relying on the AGIs to have good motivations, and not be egregiously misaligned and scheming to take over the world and wipe out humanity. Alas, I claim that the latter kind of motivation is what we should expect to occur, in [...]

Outline:
(00:26) 1. Background & threat model
(02:24) 2. The theme of 2025: trying to solve the technical alignment problem
(04:02) 3. Two sketchy plans for technical AGI alignment
(07:05) 4. On to what I've actually been doing all year!
(07:14) Thrust A: Fitting technical alignment into the bigger strategic picture
(09:46) Thrust B: Better understanding how RL reward functions can be compatible with non-ruthless-optimizers
(12:02) Thrust C: Continuing to develop my thinking on the neuroscience of human social instincts
(13:33) Thrust D: Alignment implications of continuous learning and concept extrapolation
(14:41) Thrust E: Neuroscience odds and ends
(16:21) Thrust F: Economics of superintelligence
(17:18) Thrust G: AGI safety miscellany
(17:41) Thrust H: Outreach
(19:13) 5. Other stuff
(20:05) 6. Plan for 2026
(21:03) 7. Acknowledgements

The original text contained 7 footnotes which were omitted from this narration.

---
First published: December 11th, 2025
Source: https://www.lesswrong.com/posts/CF4Z9mQSfvi99A3BR/my-agi-safety-research-2025-review-26-plans
---
Narrated by TYPE III AUDIO.
Dec 11, 2025 • 3min

“North Sentinelese Post-Singularity” by Cleo Nardo

Many people don't want to live in a crazy sci-fi world, and I predict I will be one of them. People in the past have mourned technological transformation, and they saw less of it in their lives than I will in mine.[1] It's notoriously difficult to describe a sci-fi utopia which doesn't sound unappealing to almost everyone.[2] I have plans and goals which would be disrupted by the sci-fi stuff.[3] In short: I want to live an ordinary life — mundane, normal, common, familiar — in my biological body on Earth in physical reality. I'm not okay with being killed even if I learn that, orbiting a distant black hole 10T years in the future, there is a server running a simulation of my brain in a high-welfare state. Maybe we have something like a "Right to Normalcy". This isn't a legal right, but maybe a moral right. The kind of right that means we shouldn't airdrop iPhones on North Sentinel Island.

North Sentinelese
And that reminds me -- what do we actually do with the North Sentinelese? Do we upgrade them into robot gods, or do they continue their lives? How long do we sentinelize them? As long as we [...]

The original text contained 4 footnotes which were omitted from this narration.

---
First published: December 11th, 2025
Source: https://www.lesswrong.com/posts/tnkAWguHYBLKCkXRK/north-sentinelese-post-singularity
---
Narrated by TYPE III AUDIO.
Dec 11, 2025 • 19min

“Rock Paper Scissors is Not Solved, In Practice” by Linch

Hi folks, linking my Inkhaven explanation of intermediate Rock Paper Scissors strategy, as well as feeling out an alternative way to score rock paper scissors bots. It's more polished than most Inkhaven posts, but still bear in mind that the bulk of this writing was done in ~2 days.

Rock Paper Scissors is not solved, in practice. When I was first learning to program in 2016, I spent a few years, off and on, trying to make pretty good Rock Paper Scissors bots. I spent maybe 20 hours on it in total. My best programs won about 60-65% of matches against the field; the top bots were closer to 80%. I never cracked the leaderboard, but I learned something interesting along the way: RPS is a near-perfect microcosm of adversarial reasoning. You have two goals in constant tension: predict and exploit your opponent's moves, and don't be exploitable yourself. Every strategy is, in essence, a different answer to how you balance those goals.

[Image source: https://commons.wikimedia.org/w/index.php?curid=27958688]

Simple Strategies
Always Rock
The simplest strategy is to play Rock all the time. This is the move that 35% of human players in general, and 50% of male players, open with. Rock [...]

Outline:
(01:24) Simple Strategies
(01:28) Always Rock
(02:36) Pure Random
(03:03) Sidebar: Implementation (for humans)
(03:53) Why isn't Pure Random Perfect?
(04:59) String Finder aka Aaronson Oracle
(05:57) Sidebar: One-Sided String Finder vs Two-Sided String Finder
(06:39) Why Aren't String Finders Perfect?
(07:14) The Henny strategy: Frequency-weighted randomness
(08:31) Henny's Main Limitations
(10:12) Meta-Strategy: Iocaine Powder
(12:26) Strategy Selection Heuristics
(12:41) Random Initialization
(12:57) History Matching
(13:22) Strategy Switching
(13:41) Recency Bias
(13:57) Variable Horizons
(14:11) Database and Evolutionary Attacks
(14:52) Advanced Strategies and Meta-Strategies
(15:10) Better Predictors
(15:30) Improved Meta-Strategy and Strategy Selection
(15:58) Better Game Design
(16:56) Conclusion

The original text contained 8 footnotes which were omitted from this narration.

---
First published: December 10th, 2025
Source: https://www.lesswrong.com/posts/AGZD62scqRaoM6p4n/rock-paper-scissors-is-not-solved-in-practice
---
Narrated by TYPE III AUDIO.
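The outline's "Henny" entry names a frequency-weighted randomness strategy. Below is a minimal sketch of that general idea (my own illustration, not code from the post; the function name and the Laplace smoothing are assumptions made for the example): count the opponent's past moves, sample a predicted next move from that empirical distribution, and play whatever beats the sample. A biased opponent gets punished, while the sampling keeps the bot itself stochastic rather than trivially predictable.

```python
import random

BEATS = {"R": "P", "P": "S", "S": "R"}  # the move that beats each move

def make_frequency_bot():
    """Hypothetical frequency-weighted RPS bot (illustrative sketch only)."""
    counts = {"R": 1, "P": 1, "S": 1}  # Laplace smoothing: pretend we saw one of each

    def play(opponent_last=None):
        # Update the opponent model with their previous move, if we have one.
        if opponent_last is not None:
            counts[opponent_last] += 1
        # Sample a predicted opponent move in proportion to observed frequency,
        # then answer with the move that beats the sampled prediction.
        predicted = random.choices(list(counts), weights=list(counts.values()))[0]
        return BEATS[predicted]

    return play

# Example: against an always-Rock opponent the bot drifts toward mostly Paper.
bot = make_frequency_bot()
responses = [bot("R") for _ in range(20)]
print(responses.count("P"), "of 20 responses were Paper")
```

The exploit-versus-exploitability tension described above shows up directly in this design choice: always countering the single most frequent move would exploit a biased opponent harder, but would also make the bot easier for a modelling opponent to exploit in turn.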
Dec 11, 2025 • 5min

“MIRI Comms is hiring” by Duncan Sabien (Inactive)

See details and apply. In the wake of the success of Nate and Eliezer's book, If Anyone Builds It, Everyone Dies, we have an opportunity to push through a lot of doors that have cracked open, and roll a lot of snowballs down a lot of hills. 2026 is going to be a year of ambitious experimentation, trying lots of new ways to deliver MIRI ideas and content to newly receptive audiences. This means ramping up our capacity, particularly in the arena of communications.

Our team did an admirable job in 2025 of handling all of the challenges of launching and promoting a book (including helping Nate and Eliezer assemble the supplemental materials for it, which is an artifact that we expect to be extremely useful going forward). But we have a) had to let some things slide a little bit in the scramble, and want to get the house back in order, and b) found that we need more hands for the upcoming push.

Further description is available here, along with the application. A (very abridged) version is below. We're hoping to hire somewhere between 2 and 8 new team members within the next 3 months. We'll be doing a more [...]

---
First published: December 10th, 2025
Source: https://www.lesswrong.com/posts/J7CekMF7WRwsud8Df/miri-comms-is-hiring
---
Narrated by TYPE III AUDIO.
Dec 11, 2025 • 9min

“Gradual Disempowerment Monthly Roundup #3” by Raymond Douglas

Farewell to Friction
So sayeth Zvi: "when defection costs drop dramatically, equilibria break". Even if AI makes individual tasks easier, this can still cause all kinds of societal problems, because for many features of the world, the difficulty is load-bearing.

John Stone gives a reflection on this phenomenon, ironically with the editing help of GPT5. He draws a nice parallel with the Jevons paradox: now that AI is making certain tasks like job applications easier, people are just spamming them in a way that overwhelms the system. And the problem is a lot broader than applications and filtering processes. Last year, two Harvard students plugged some smart glasses into facial recognition software so that they could automatically identify people by looking at them. With minimal scaffolding, you could easily integrate deep research's ability to swiftly build a profile on people based on just their name (try it and see!), or frontier models' capacity to identify locations from pictures. Turns out our society really takes for granted the fact that a stranger cannot, simply by looking at you, infer your name, address, and biography. I think there's sometimes a tendency among people worried about catastrophic risk to sort of write [...]

Outline:
(00:11) Farewell to Friction
(02:39) Won't Somebody Think of the Feedback Loops
(04:05) Free Money
(06:34) A non-constructive proof of political influence
(08:24) In other news...

---
First published: December 9th, 2025
Source: https://www.lesswrong.com/posts/99yCxb5KGCZTYccKR/gradual-disempowerment-monthly-roundup-3
---
Narrated by TYPE III AUDIO.
Dec 11, 2025 • 11min

“Follow-through on Bay Solstice” by Raemon

There is a Bay 2025 Solstice Feedback Form. Please fill it out if you came, and especially fill it out if you felt alienated, or disengaged, or that Solstice left you worse than it found you. (Also fill out the first question if you consciously chose not to come.) The feedback form also includes a section for people interested in running a future Bay solstice (summer or winter).

The feedback form focuses on high-level, qualitative feedback. You can also vote and comment on the quality of individual songs/speeches here.

I had a subtle goal, with a narrow target, for Bay Solstice this year. I wanted to:
earnestly face the possibility of living in a world where AI was quite likely to kill everyone soon.
not advocate that other people believe that. It's something I believe, and my believing it was part of the Solstice. But the point wasn't for other people to change their beliefs. The point was to give people an opportunity to Pre-Grieve / Stoic Meditate on it, so that uncertain fear-of-it would have less power over them.
give people a few different healthy options for how to contend with that (and some impetus for [...]

Outline:
(02:06) If you left worse than you came, I will try to help
(03:54) This is not meant to be every year
(04:40) Solstice for whom?
(07:35) Spending / Gambling a limited resource of trust
(09:10) Is it bad that our Big Holiday is about Being Sad?

---
First published: December 10th, 2025
Source: https://www.lesswrong.com/posts/Zb8ov7ai4zRdpxhQt/follow-through-on-bay-solstice
---
Narrated by TYPE III AUDIO.
