

LessWrong (Curated & Popular)
LessWrong
Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you'd like more, subscribe to the “LessWrong (30+ karma)” feed.
Episodes
Mentioned books

Mar 9, 2025 • 7min
“How Much Are LLMs Actually Boosting Real-World Programmer Productivity?” by Thane Ruthenis
LLM-based coding-assistance tools have been out for ~2 years now. Many developers have been reporting that this is dramatically increasing their productivity, up to 5x'ing/10x'ing it.

It seems clear that this multiplier isn't field-wide, at least. There's no corresponding increase in output, after all.

This would make sense. If you're doing anything nontrivial (i.e., anything other than adding minor boilerplate features to your codebase), LLM tools are fiddly. Out-of-the-box solutions don't Just Work for that purpose. You need to significantly adjust your workflow to make use of them, if that's even possible. Most programmers wouldn't know how to do that, or wouldn't care to bother.

It's therefore reasonable to assume that a 5x/10x greater output, if it exists, is unevenly distributed, mostly affecting power users and people particularly talented at using LLMs.

Empirically, we likewise don't seem to be living in the world where the whole software industry is suddenly 5-10 times [...]

The original text contained 1 footnote which was omitted from this narration. --- First published: March 4th, 2025 Source: https://www.lesswrong.com/posts/tqmQTezvXGFmfSe7f/how-much-are-llms-actually-boosting-real-world-programmer --- Narrated by TYPE III AUDIO.

Mar 9, 2025 • 9min
“So how well is Claude playing Pokémon?” by Julian Bradshaw
Background: After the release of Claude 3.7 Sonnet,[1] an Anthropic employee started livestreaming Claude trying to play through Pokémon Red. The livestream is still going right now.

TL;DR: So, how's it doing? Well, pretty badly. Worse than a 6-year-old would, definitely not PhD-level.

Digging in

But wait! you say. Didn't Anthropic publish a benchmark showing Claude isn't half-bad at Pokémon? Why yes, they did, and the data shown is believable. Currently, the livestream is on its third attempt, with the first being basically just a test run. The second attempt got all the way to Vermilion City, finding a way through the infamous Mt. Moon maze and achieving two badges, so pretty close to the benchmark. But look carefully at the x-axis in that graph. Each "action" is a full Thinking analysis of the current situation (often several paragraphs' worth), followed by a decision to send some kind [...]

---Outline:
(00:29) Digging in
(01:50) What's going wrong?
(07:55) Conclusion

The original text contained 4 footnotes which were omitted from this narration. The original text contained 1 image which was described by AI. --- First published: March 7th, 2025 Source: https://www.lesswrong.com/posts/HyD3khBjnBhvsp8Gb/so-how-well-is-claude-playing-pokemon --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

Mar 7, 2025 • 18sec
“Methods for strong human germline engineering” by TsviBT
Note: an audio narration is not available for this article. Please see the original text. The original text contained 169 footnotes which were omitted from this narration. The original text contained 79 images which were described by AI. --- First published: March 3rd, 2025 Source: https://www.lesswrong.com/posts/2w6hjptanQ3cDyDw7/methods-for-strong-human-germline-engineering --- Narrated by TYPE III AUDIO.

4 snips
Mar 6, 2025 • 4min
“Have LLMs Generated Novel Insights?” by abramdemski, Cole Wyeth
The discussion revolves around the ability of large language models to generate novel insights. Critics argue that LLMs have yet to prove their worth in significant achievements, like theorem proving or impactful writing. An intriguing anecdote highlights a chemist who received a helpful suggestion from an LLM that resolved a difficult synthesis issue. This juxtaposition raises questions about whether LLMs are genuinely insightful or merely good at predicting outcomes based on existing information.

Mar 6, 2025 • 19min
“A Bear Case: My Predictions Regarding AI Progress” by Thane Ruthenis
This isn't really a "timeline", as such – I don't know the timings – but this is my current, fairly optimistic take on where we're heading.

I'm not fully committed to this model yet: I'm still on the lookout for more agents and inference-time scaling later this year. But Deep Research, Claude 3.7, Claude Code, Grok 3, and GPT-4.5 have turned out largely in line with these expectations,[1] and this is my current baseline prediction.

The Current Paradigm: I'm Tucking In to Sleep

I expect that none of the currently known avenues of capability advancement are sufficient to get us to AGI.[2] I don't want to say that pretraining will "plateau", as such; I do expect continued progress. But the dimensions along which the progress happens are going to decouple from the intuitive "getting generally smarter" metric, and will face steep diminishing returns. Grok 3 and GPT-4.5 [...]

---Outline:
(00:35) The Current Paradigm: I'm Tucking In to Sleep
(10:24) Real-World Predictions
(15:25) Closing Thoughts

The original text contained 7 footnotes which were omitted from this narration. --- First published: March 5th, 2025 Source: https://www.lesswrong.com/posts/oKAFFvaouKKEhbBPm/a-bear-case-my-predictions-regarding-ai-progress --- Narrated by TYPE III AUDIO.

Mar 5, 2025 • 18min
“Statistical Challenges with Making Super IQ babies” by Jan Christian Refsgaard
This is a critique of How to Make Superbabies on LessWrong.

Disclaimer: I am not a geneticist,[1] and I've tried to use as little jargon as possible, so I used the word "mutation" as a stand-in for SNP (single nucleotide polymorphism, a common type of genetic variation).

Background

The Superbabies article has 3 sections, where they show:

Why: We should do this, because the effects of editing will be big.
How: Explain how embryo editing could work, if academia were not mind-killed (hampered by institutional constraints).
Other: Legal stuff and technical details.

Here is a quick summary of the "why" part of the original article's arguments; the rest is not relevant to understanding my critique. We can already make (slightly) superbabies by selecting embryos with "good" mutations, but this does not scale, as there are diminishing returns and almost no gain past "best [...]

---Outline:
(00:25) Background
(02:25) My Position
(04:03) Correlation vs. Causation
(06:33) The Additive Effect of Genetics
(10:36) Regression towards the null part 1
(12:55) Optional: Regression towards the null part 2
(16:11) Final Note

The original text contained 4 footnotes which were omitted from this narration. --- First published: March 2nd, 2025 Source: https://www.lesswrong.com/posts/DbT4awLGyBRFbWugh/statistical-challenges-with-making-super-iq-babies --- Narrated by TYPE III AUDIO.

Mar 4, 2025 • 2min
“Self-fulfilling misalignment data might be poisoning our AI models” by TurnTrout
This is a link post.

Your AI's training data might make it more "evil" and more able to circumvent your security, monitoring, and control measures. Evidence suggests that when you pretrain a powerful model to predict a blog post about how powerful models will probably have bad goals, then the model is more likely to adopt bad goals. I discuss ways to test for and mitigate these potential mechanisms. If tests confirm the mechanisms, then frontier labs should act quickly to break the self-fulfilling prophecy.

Research I want to see

Each of the following experiments assumes positive signals from the previous ones:

Create a dataset and use it to measure existing models
Compare mitigations at a small scale
An industry lab running large-scale mitigations

Let us avoid the dark irony of creating evil AI because some folks worried that AI would be evil. If self-fulfilling misalignment has a strong [...]

The original text contained 1 image which was described by AI. --- First published: March 2nd, 2025 Source: https://www.lesswrong.com/posts/QkEyry3Mqo8umbhoK/self-fulfilling-misalignment-data-might-be-poisoning-our-ai --- Narrated by TYPE III AUDIO.

Mar 1, 2025 • 11min
“Judgements: Merging Prediction & Evidence” by abramdemski
In this engaging discussion, abramdemski, an author well-versed in Bayesianism and radical probabilism, dives into the nuanced relationship between prediction and evidence. He explores how market dynamics reflect this interplay, shedding light on trading strategies influenced by both intrinsic and extrinsic values. The conversation also unpacks modern reasoning models in judgment and decision-making, contrasting them with traditional beliefs, and reveals how unlimited resources reshape trading behavior. A thought-provoking exploration for anyone curious about decision theory!

7 snips
Feb 26, 2025 • 13min
“The Sorry State of AI X-Risk Advocacy, and Thoughts on Doing Better” by Thane Ruthenis
Thane Ruthenis, an insightful author focused on AI risk advocacy, shares his thoughts on improving communication strategies. He discusses the limitations of traditional persuasion techniques when addressing knowledgeable audiences. Instead, he emphasizes the power of framing to engage individuals with a deep understanding of AI issues. Thane proposes innovative outreach through popular media to better educate the public on AI risks and mobilize support for safety initiatives. This perspective challenges conventional methods, urging a fresh approach to AI advocacy.

16 snips
Feb 26, 2025 • 27min
“Power Lies Trembling: a three-book review” by Richard_Ngo
Richard Ngo, an insightful author and thinker, delves into the sociology of military coups and social dynamics. He paints coups as rare supernovae that reveal the underlying forces of society, particularly through Naunihal Singh's research on Ghana. Ngo discusses how preference falsification shapes societal behavior, especially in racial discrimination, and emphasizes the importance of expressing true beliefs. The conversation also touches on Kierkegaard's ideas, contrasting different forms of faith and their roles in uniting individuals for collective action.


