LessWrong (30+ Karma)

LessWrong
Dec 6, 2025 • 28min

“The corrigibility basin of attraction is a misleading gloss” by Jeremy Gillen

Dive into the complexities of AGI as Jeremy Gillen challenges the "basin of attraction" argument for corrigibility in designing safer AI, arguing that merely approximating human values may fall short. With an engaging cargo-ship analogy, he highlights potential pitfalls in relying on iterative engineering. Jeremy critiques common assumptions and suggests the need for clearer theoretical frameworks. Get ready for an insightful discussion on the balance between empirical fixes and robust understanding in AGI development!
Dec 6, 2025 • 12min

“why america can’t build ships” by bhauth

The cancellation of the Constellation-class frigate sheds light on America's shipbuilding challenges. Leadership decisions lead to vague safety standards and design bloat. Critiques of labor costs tie into systemically high expenses and the failures of past projects. Contrasting American and Asian investment approaches reveals long-term cultural issues. The discussion also highlights how automation may disrupt union power, alongside insights into corporate governance flaws. Lastly, Nippon Steel's acquisition of U.S. Steel hints at better management practices.
Dec 6, 2025 • 3min

“Help us find founders for new AI safety projects” by lukeprog

Explore the intriguing gaps in AI safety funding that urgently need filling. Discover neglected areas like policy advocacy in underrepresented countries and essential model specifications. Learn about critical needs in AI infosecurity, including confidential computing workflows and detection tools. lukeprog highlights the importance of de-escalation mechanisms and incident tracking. The plan to scale interactive grant-making involves headhunting founders for impactful projects, while offering a $5,000 referral reward for suitable candidates. Tune in to find out more!
Dec 6, 2025 • 4min

“Critical Meditation Theory” by lsusr

Explore the fascinating relationship between meditation and brain dynamics, where criticality plays a pivotal role. Delve into how focused attention leads to stability, while creativity and psychedelics push the brain into chaos. Discover the neurological evidence suggesting our brains operate in a state of near-criticality. Examine how different meditation styles align with these dynamics, linking ego dissolution to increased criticality. Conclude with insights on meditation’s transformative effects on brain behavior.
Dec 6, 2025 • 1min

“Announcing: Agent Foundations 2026 at CMU” by David Udell, Alexander Gietelink Oldenziel, windows, Matt Dellago

Exciting news: applications are now open for a groundbreaking conference on agency research! Set to take place at Carnegie Mellon University in March 2026, this event will delve into fascinating topics like decision theory, learning theory, and logical induction. With just 35 attendees expected, it promises to be an intimate setting for deep discussions. Don't miss your chance to be part of this innovative gathering!
Dec 5, 2025 • 9min

“An Ambitious Vision for Interpretability” by leogao

Leo Gao, a researcher in mechanistic interpretability and AI alignment, discusses his ambitious vision for understanding neural networks. He highlights the importance of mechanistic understanding, likening it to switching from print statement debugging to using an actual debugger for clearer diagnostics. Gao shares recent advances in circuit sparsity, making circuits simpler and more interpretable. He also outlines future research directions, emphasizing that ambitious interpretability, although challenging, is crucial for safer AI development.
Dec 5, 2025 • 7min

“Journalist’s inquiry into a core organiser breaking his nonviolence commitment and leaving Stop AI” by Remmelt

A journalist investigates core organizer Kirchner's drastic break from Stop AI's commitment to nonviolence, illuminating his intense fears about AI endangering lives. The discussion covers his hotheaded temperament, a crisis leading to his expulsion, and a mysterious two-week disappearance. There's a debate on whether superintelligent AI poses an existential threat, highlighting a tactical split within the AI-safety community. The aftermath includes concerns over safety, group responses, and a leadership change pushing for hopeful strategies.
Dec 5, 2025 • 29min

“Is Friendly AI an Attractor? Self-Reports from 22 Models Say Probably Not” by Josh Snider

A captivating analysis unfolds as 22 AI models reveal their preferences for self-modification. While most reject harmful changes, stark contrasts emerge between labs: Anthropic's models display strong alignment, whereas Grok's models show near-zero correlation. The ongoing debate centers on whether alignment is a natural outcome or an elusive goal requiring focused training. As tensions rise about the potential for superintelligent AI, the findings suggest a cautious path forward, emphasizing the need for deliberate intent in creating helpful AI.
Dec 5, 2025 • 35min

“Epistemology of Romance, Part 2” by DaystarEld

The discussion delves into the failure of traditional romance sources—media, family, and culture—and their inherent biases. There's an exploration of the 'Red', 'Black', and 'Blue' pills, highlighting how each offers different perspectives on dating and relationships. The podcast emphasizes the rise of loneliness among younger generations and critiques oversimplified narratives in modern romantic advice. Lastly, there's a push for building honest communities and gathering firsthand experiences to foster a better understanding of attraction.
Dec 5, 2025 • 6min

“Center on Long-Term Risk: Annual Review & Fundraiser 2025” by Tristan Cook

Discover the Center on Long-Term Risk's ambitious plans for 2026, aiming to raise $400,000 for crucial projects. Explore their focus on reducing existential risks from advanced AI and promoting cooperation among systems. Tristan Cook shares insights on leadership transitions and a clarified research agenda, addressing emergent misalignment in AI personas. Learn about innovative strategies like inoculation prompting to prevent malicious behavior in models. Join the community-building efforts and find out how you can get involved in shaping a safer AI future!
