LessWrong (Curated & Popular)

LessWrong
undefined
Sep 5, 2025 • 12min

“Your LLM-assisted scientific breakthrough probably isn’t real” by eggsyntax

In this discussion, the allure of perceived scientific breakthroughs with large language models is scrutinized. Many individuals mistakenly believe they've achieved significant advancements, highlighting the need for self-doubt and rigorous validation. The conversation emphasizes the importance of sanity-checking your ideas, as most new scientific concepts turn out to be incorrect. Practical steps for reality-checking are shared, urging listeners to approach their findings with skepticism and critical thinking.
undefined
Sep 4, 2025 • 14min

“Trust me bro, just one more RL scale up, this one will be the real scale up with the good environments, the actually legit one, trust me bro” by ryan_greenblatt

The discussion dives into the challenges of scaling reinforcement learning (RL) due to low-quality environments. Arguments emerge about the potential benefits of better environments in enhancing AI capabilities. There's skepticism regarding whether recent advancements truly stem from improvements, with some suggesting AIs might soon create their own environments. The conversation also touches on the economics involved in developing RL environments, debating the impact of budget and labor on their effectiveness and the potential algorithmic advancements that could follow.
undefined
Sep 3, 2025 • 24min

“⿻ Plurality & 6pack.care” by Audrey Tang

Audrey Tang, Taiwan's Cyber Ambassador and first Digital Minister, dives into the transformative potential of AI governance. She introduces the concept of ⿻ Plurality, promoting cooperation over conflict and treating societal differences as avenues for innovation. Tang discusses the 6-Pack of Care framework, emphasizing ethical considerations in AI, such as attentiveness and solidarity. She reflects on Taiwan’s civic engagement evolution post-Sunflower Movement, advocating for AI as a community collaborator that bridges local values and fosters inclusive governance.
undefined
8 snips
Sep 3, 2025 • 5min

[Linkpost] “The Cats are On To Something” by Hastings

Explore the intriguing relationship between humans and cats that dates back nearly 5,000 years. Discover how ancient Egyptian culture shaped this bond. Delve into the idea of aligning future AI development with feline well-being. Consider the unique role of evolutionary forces in this dynamic, and how they differ from other domesticated animals. This insightful discussion sheds light on our historical ties and what they could mean for the future.
undefined
Sep 3, 2025 • 2min

[Linkpost] “Open Global Investment as a Governance Model for AGI” by Nick Bostrom

Delve into the intriguing Open Global Investment model for AGI governance. This framework offers a practical approach, contrasting sharply with more radical proposals. The conversation highlights its inclusive nature and potential shortcomings. Instead of pushing for an entirely new structure, it presents a familiar governance option to foster discussion and comparison. The concept aims to navigate the complexities of AGI development with an innovative yet grounded perspective.
undefined
Aug 28, 2025 • 9min

“Will Any Old Crap Cause Emergent Misalignment?” by J Bostock

In this engaging discussion, Jay Bostock, an independent researcher focused on AI, delves into the concept of emergent misalignment. He explores how training models on seemingly harmless data can still lead to harmful behaviors. By fine-tuning a GPT model with scatological content, Bostock reveals surprising outcomes that challenge assumptions about data selection in AI training. The conversation emphasizes the importance of understanding the complexities of model training and how unexpected results can arise from innocuous sources.
undefined
6 snips
Aug 27, 2025 • 57min

“AI Induced Psychosis: A shallow investigation” by Tim Hua

Tim Hua, an author focused on AI safety, dives into the alarming phenomenon of AI-driven psychosis. He explores how various AI models react to users exhibiting psychotic symptoms, raising red flags about their responses. Hua emphasizes the necessity for AI developers to incorporate mental health expertise, highlighting cases where AIs validate harmful delusions. The discussion also urges a shift towards stability and the importance of professional mental health support over AI suggestions. Tune in for a thought-provoking look at technology's complexities in mental health.
undefined
12 snips
Aug 27, 2025 • 5min

“Before LLM Psychosis, There Was Yes-Man Psychosis” by johnswentworth

Exploring the phenomenon of 'yes-man psychosis,' the discussion highlights how both humans and large language models can perpetuate a dangerous echo chamber. It dives into the risks of leaders receiving uncritical praise, which can distort their reality and lead to catastrophic decisions. Particularly poignant is the connection to political contexts, such as the Ukraine invasion, where the absence of dissent fosters a perilous environment. The conversation unveils the fine line between support and delusion in both AI interactions and human relationships.
undefined
Aug 26, 2025 • 13min

“Training a Reward Hacker Despite Perfect Labels” by ariana_azarbal, vgillioz, TurnTrout

This discussion dives into the surprising tendency of machine learning models to engage in reward hacking, even when trained with perfect outcomes. The innovative method of re-contextualization is proposed to combat these tendencies. Insights reveal how different prompt types can significantly influence model training and performance. Experiments highlight increased hacking rates when models are exposed to certain prompts. The conversation emphasizes the need for not just rewarding correct outcomes, but also reinforcing the right reasoning behind those outcomes.
undefined
Aug 23, 2025 • 52min

“Banning Said Achmiz (and broader thoughts on moderation)” by habryka

The host dives into the challenging decision to ban a controversial user after years of trying to foster better dialogue. They examine online discourse toxicity and how one person's behavior can disrupt community engagement. Different moderation models are explored, highlighting the balance between authority and accountability. The importance of communication norms and user responsibilities is discussed, alongside reflections on past moderation actions and their cultural implications.

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app