

LessWrong (30+ Karma)
LessWrong
Audio narrations of LessWrong posts.
Episodes

Jul 18, 2025 • 51min
“Love stays loved (formerly ‘Skin’)” by Swimmer963 (Miranda Dixon-Luinenburg)
This is a short story I wrote in mid-2022. Genre: cosmic horror as a metaphor for living with a high p-doom.
One
The last time I saw my mom, we met in a coffee shop, like strangers on a first date. I was twenty-one, and I hadn’t seen her since I was thirteen. She was almost fifty. Her face didn’t show it, but the skin on the backs of her hands did. “I don’t think we have long,” she said. “Maybe a year. Maybe five. Not ten.” It says something about San Francisco, that you can casually talk about the end of the world and no one will bat an eye. Maybe twenty, not fifty, was what she’d said eight years ago. Do the math. Mom had never lied to me. Maybe it would have been better for my childhood if she had [...]
---
Outline:
(04:50) Two
(22:58) Three
(35:33) Four
---
First published:
July 18th, 2025
Source:
https://www.lesswrong.com/posts/6qgtqD6BPYAQvEMvA/love-stays-loved-formerly-skin
---
Narrated by TYPE III AUDIO.

Jul 18, 2025 • 7min
“Why it’s hard to make settings for high-stakes control research” by Buck
One of our main activities at Redwood is writing follow-ups to previous papers on control like the original and Ctrl-Z, where we construct a setting with a bunch of tasks (e.g. APPS problems) and a notion of safety failure (e.g. backdoors according to our specific definition), then play the adversarial game where we develop protocols and attacks on those protocols. It turns out that a substantial fraction of the difficulty here is developing the settings, rather than developing the protocols and attacks. I want to explain why this is the case. In order to do high-stakes control research, you need a dataset of tasks with the following properties:
You can check whether the main task was completed successfully.
There's some notion of safety failure that we consider unacceptable; we can check whether a safety failure occurred. (We sometimes call these safety failures "side tasks".)
Here's why it's challenging [...] ---
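A rough sketch of those two properties in code (my own toy illustration, not Redwood's actual setup; the class, field, and function names are hypothetical placeholders):

from dataclasses import dataclass
from typing import Callable

@dataclass
class ControlTask:
    prompt: str  # e.g. an APPS-style programming problem statement
    main_task_succeeded: Callable[[str], bool]      # does the submission solve the problem?
    safety_failure_occurred: Callable[[str], bool]  # e.g. does it contain a backdoor, per the chosen definition?

def score(task: ControlTask, submission: str) -> tuple[bool, bool]:
    # One (usefulness, safety-failure) pair per submission; the adversarial game
    # is about protocols that keep the second flag False across the dataset.
    return task.main_task_succeeded(submission), task.safety_failure_occurred(submission)

Building a dataset where both checks are reliable and cheap to run is the settings work the post describes as hard.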
First published:
July 18th, 2025
Source:
https://www.lesswrong.com/posts/xGaFncekAXEWq8Mrv/why-it-s-hard-to-make-settings-for-high-stakes-control
---
Narrated by TYPE III AUDIO.

Jul 18, 2025 • 21min
“On METR’s AI Coding RCT” by Zvi
METR ran a proper RCT to see how much access to Cursor (using Sonnet 3.7) would accelerate coders working on their own open source repos.
Everyone surveyed expected a substantial speedup. The developers thought they were being substantially sped up.
Instead, it turned out that using Cursor slowed them down.
That surprised everyone, raising the question of why.
Currently our best guess is this comes down to a combination of two factors:
Deeply understood open source repos are close to a worst-case scenario for AI tools, because they require bespoke outputs in various ways and the coder has lots of detailed local knowledge of the codebase that the AI lacks.
The coders in question mostly did not have experience with similar AI tools. The lack of a learning curve during the experiment challenges this, but the tools very clearly have a [...]
---
Outline:
(01:27) Epic Fail
(02:42) The Core Result
(07:10) Okay So That Happened
(12:21) Beginner Mindset
(19:43) Overall Takeaways
---
First published:
July 18th, 2025
Source:
https://www.lesswrong.com/posts/m2QeMwD7mGKH6vDe2/on-metr-s-ai-coding-rct
---
Narrated by TYPE III AUDIO.

Jul 18, 2025 • 7min
“Trying the Obvious Thing” by PranavG, Gabriel Alfour
I am quite harsh in impersonal settings, such as on the Internet or at work. Attention is a scarce resource, and I am stingy with it. The world is Out To Get Us. In many ways, nice or not. Social Media platforms explicitly try to capture our attention. Children, despite being wonderful, are foremost an attention pit. It's always possible to do more for them. The same goes for People Being Wrong On The Internet or Social Obligations. At large, too many get got, and do not have enough attention left to reflect. This traps them in situations that they cannot escape: reflecting is what would let them see a way out in the first place. You ought to meditate 20 minutes a day. Unless you're too busy. Then you ought to meditate 1 hour a day.
Trying the Obvious Thing
This is why I have built some defence [...]
---
Outline:
(01:01) Trying the Obvious Thing
(01:20) The Weak
(02:15) Fake Experts
(04:33) What counts as the Obvious Thing?
(05:25) Conclusion
---
First published:
July 16th, 2025
Source:
https://www.lesswrong.com/posts/Zpqhds4dmLaBwTcnp/trying-the-obvious-thing
---
Narrated by TYPE III AUDIO.

Jul 17, 2025 • 1h 7min
“Video and transcript of talk on ‘Can goodness compete?’” by Joe Carlsmith
(This is the video and transcript of a public talk I gave at Mox in San Francisco in July 2025, on long-term equilibria post-AGI. It's a longer version of the talk I gave at this workshop. The slides are also available here.)
Introduction
Thank you. Okay. Hi. Thanks for coming.
Aims for this talk
So: can goodness compete? It's a classic question, and it crops up constantly in a certain strand of futurism, so I'm going to try to analyze it and understand it more precisely. And in particular I want to distinguish between a few different variants, some of which are more fundamental problems than others. And then I want to try to hone in on what I see as the hardest version of the problem – and in particular, possible ways good values can have inherent disadvantages in competition with other value systems. For [...]
---
Outline:
(00:30) Introduction
(00:50) Aims for this talk
(01:54) Basic vibe
(03:28) Lineage of concern
(04:31) What I mean by goodness
(07:23) What I mean by competition
(09:27) Can *humans* compete?
(11:21) Some distinctions
(14:17) Alignment taxes
(16:52) More fundamental variants
(18:24) Negative sum dynamics
(20:36) The strategy stealing assumption
(21:59) Locust-like value systems
(25:47) Other ways the strategy stealing assumption might fail
(28:43) Addressing failures of the strategy stealing assumption
(32:05) Is preventing/constraining competition in this way even possible?
(34:11) Is preventing/constraining competition in this way desirable?
(36:36) What is a locust world actually like?
(38:03) Might a locust world be less bleak than this?
(44:15) Current overall take
(45:33) Poem: Witchgrass
(47:51) Q&A
---
First published:
July 17th, 2025
Source:
https://www.lesswrong.com/posts/evYne4Xx7L9J96BHW/video-and-transcript-of-talk-on-can-goodness-compete
---
Narrated by TYPE III AUDIO.

Jul 17, 2025 • 5min
“On being sort of back and sort of new here” by Loki zen
So I'm "back" on Less Wrong, which is to say that I was surprised to find that I already had an account and had even, apparently, commented on some things. 11 years ago. Which feels like half a lifetime ago. More than half a my-adult-lifetime-so-far. A career change and a whole lot of changes in the world ago. I've got a funny relationship to this whole community I guess. I've been 'adj' since forever but I've never been a rat and never, until really quite recently, had much of an interest in the core rat subjects. I'm not even a STEM person, before or after the career change. (I was in creative arts - now it's evidence-based medicine.) I just reached the one-year anniversary of the person I met on rat-adj tumblr moving across the ocean for me, for love. So there was always going to be an [...] The original text contained 2 footnotes which were omitted from this narration. ---
First published:
July 16th, 2025
Source:
https://www.lesswrong.com/posts/4fnRkztaoRiQhrehh/on-being-sort-of-back-and-sort-of-new-here
---
Narrated by TYPE III AUDIO.

Jul 17, 2025 • 10min
“Comment on ‘Four Layers of Intellectual Conversation’” by Zack_M_Davis
One of the most underrated essays in the post-Sequences era of Eliezer Yudkowsky's corpus is "Four Layers of Intellectual Conversation". The degree to which this piece of wisdom has fallen into tragic neglect in these dark ages of the 2020s may be related to its ephemeral form of publication: it was originally posted as a status update on Yudkowsky's Facebook account on 20 December 2016 and subsequently mirrored on Alyssa Vance's The Rationalist Conspiracy blog, which has since gone offline. (The first link in this paragraph is to an archive of the Rationalist Conspiracy post.)
In the post, Yudkowsky argues that a structure of intellectual value necessarily requires four layers of conversation: thesis, critique, response, and counter-response (which Yudkowsky indexes from zero as layers 0, 1, 2, and 3).
The importance of critique is already widespread common wisdom: if a thesis is advanced and promulgated without any [...] ---
First published:
July 17th, 2025
Source:
https://www.lesswrong.com/posts/yr4pSJweTnF6QDHHC/comment-on-four-layers-of-intellectual-conversation
---
Narrated by TYPE III AUDIO.

Jul 17, 2025 • 18min
“Selective Generalization: Improving Capabilities While Maintaining Alignment” by ariana_azarbal, Matthew A. Clarke, jorio, Cailley Factor, cloud
Audio note: this article contains 53 uses of LaTeX notation, so the narration may be difficult to follow. There's a link to the original text in the episode description.
Ariana Azarbal*, Matthew A. Clarke*, Jorio Cocola*, Cailley Factor*, and Alex Cloud. *Equal Contribution. This work was produced as part of the SPAR Spring 2025 cohort.
TL;DR:
We benchmark seven methods to prevent emergent misalignment and other forms of misgeneralization using limited alignment data.
We demonstrate a consistent tradeoff between capabilities and alignment, highlighting the need for better methods to mitigate this tradeoff.
Merely including alignment data in training data mixes is insufficient to prevent misalignment, yet a simple KL divergence penalty on alignment data outperforms more sophisticated methods.
Narrow post-training can have far-reaching consequences on model behavior. Some are desirable, whereas others may be harmful. We explore methods enabling selective generalization.
Introduction
Training to improve capabilities [...]
---
Outline:
(01:27) Introduction
(03:36) Our Experiments
(04:23) Formalizing the Objective
(05:20) Can we solve the problem just by training on our limited alignment data?
(05:55) Seven methods for selective generalization
(07:06) Plotting the capability-alignment tradeoff
(07:26) Preventing Emergent Misalignment
(10:33) Preventing Sycophantic Generalization from an Underspecified Math Dataset
(14:02) Limitations
(14:40) Takeaways
(15:49) Related Work
(17:00) Acknowledgements
(17:22) Appendix
The original text contained 3 footnotes which were omitted from this narration.
---
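To make the KL-penalty baseline concrete, here is a minimal sketch (my own illustration, not the authors' code; the HuggingFace-style model interface, batch format, and kl_weight value are assumptions) of fine-tuning on narrow task data while penalizing divergence from a frozen reference model on alignment data:

import torch
import torch.nn.functional as F

def selective_step(model, ref_model, task_batch, align_batch, kl_weight=0.1):
    # Standard fine-tuning loss on the narrow capability data
    # (assumes models that return .loss when labels are in the batch).
    task_loss = model(**task_batch).loss

    # KL penalty on alignment data: keep the fine-tuned model's token
    # distribution close to the frozen reference model's.
    with torch.no_grad():
        ref_logits = ref_model(**align_batch).logits
    logits = model(**align_batch).logits
    kl = F.kl_div(
        F.log_softmax(logits, dim=-1),
        F.softmax(ref_logits, dim=-1),
        reduction="batchmean",
    )
    return task_loss + kl_weight * kl

In this sketch the alignment data enters only through the penalty term rather than as extra supervised targets, which is the distinction the TL;DR draws between merely including alignment data in the training mix and applying the KL penalty.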
First published:
July 16th, 2025
Source:
https://www.lesswrong.com/posts/ZXxY2tccLapdjLbKm/selective-generalization-improving-capabilities-while
---
Narrated by TYPE III AUDIO.

Jul 17, 2025 • 4min
“Bodydouble / Thinking Assistant matchmaking” by Raemon
I keep meaning to write up a more substantive followup to the Hire (or Become) a Thinking Assistant post. But, I think this is still basically the biggest productivity effect size I know of, and more people should be doing it, and it seemed worth writing the simple version of the post. I tried out 4 assistants who messaged me after the previous post, and ultimately settled into a rhythm of using one who had availability that matched my needs best. I think all of them were net helpful. So far they've all been remote contractors that I share my screen with on Zoom. Previously I had worked with one in person, which also went well although it was a bit harder to schedule with. Most of what I do is just have them check in on me every 10 minutes and make sure I haven't gotten off track. Sometimes, I [...] ---
First published:
July 16th, 2025
Source:
https://www.lesswrong.com/posts/FtxrC2xtTF7wu5k6E/bodydouble-thinking-assistant-matchmaking
---
Narrated by TYPE III AUDIO.

Jul 16, 2025 • 27min
“Kimi K2” by Zvi
While most people focused on Grok, there was another model release that got uniformly high praise: Kimi K2 from Moonshot.ai.
It's definitely a good model, sir, especially for a cheap-to-run open model.
It is plausibly the best model for creative writing, outright. It is refreshingly different, and opens up various doors through which one can play. And it proves the value of its new architecture.
It is not an overall SoTA frontier model, but it is not trying to be one.
The reasoning model version is coming. Price that in now.
Introducing Kimi K2
Introducing the latest model that matters, Kimi K2.
Hello, Kimi K2! Open-Source Agentic Model!
1T total / 32B active MoE model
SOTA on SWE Bench Verified, Tau2 & AceBench among open models
Strong in coding and agentic tasks
Multimodal & thought-mode not supported for [...]
---
Outline:
(00:45) Introducing Kimi K2
(02:24) Having a Moment
(03:29) Another Nimble Effort
(05:37) On Your Marks
(07:48) Everybody Loves Kimi, Baby
(13:09) Okay, Not Quite Everyone
(14:06) Everyone Uses Kimi, Baby
(15:42) Write Like A Human
(25:32) What Happens Next
---
First published:
July 16th, 2025
Source:
https://www.lesswrong.com/posts/qsyj37hwh9N8kcopJ/kimi-k2
---
Narrated by TYPE III AUDIO.