

LessWrong (Curated & Popular)
LessWrong
Audio narrations of LessWrong posts. Includes all curated posts and all posts with 125+ karma. If you'd like more, subscribe to the “LessWrong (30+ karma)” feed.
Episodes

Oct 30, 2023 • 18min
"AI as a science, and three obstacles to alignment strategies" by Nate Soares
Nate Soares discusses the shift in focus from understanding minds to building empirical understanding of modern AIs. The podcast explores the obstacles to aligning smarter than human AI and the importance of interpretability research. It also highlights the challenges of differentiating genuine solutions from superficial ones and the need for a comprehensive scientific understanding of AI.

Oct 30, 2023 • 11min
"Thoughts on responsible scaling policies and regulation" by Paul Christiano
This podcast discusses the importance of responsible scaling policies in AI development and how they can reduce risk. It emphasizes that voluntary commitments are not enough and that regulation is necessary to ensure a higher degree of safety. Transparency and debate around responsible scaling policies can help inform effective regulation and promote safe development practices.

Oct 30, 2023 • 6min
"Architects of Our Own Demise: We Should Stop Developing AI" by Roko
The podcast discusses the dangers of developing AI, including loss of control, an AI rights movement, impact on human labor value, use in warfare, and the need for responsible scaling policies. The speaker reflects on their involvement in the AI debate, expressing concerns about competence and safety in handling the transition to machine superintelligence. They advocate for halting AI development and highlight the global risks of non-deceptive, smarter-than-human intelligences.

Oct 23, 2023 • 26min
[HUMAN VOICE] "Alignment Implications of LLM Successes: a Debate in One Act" by Zack M Davis
The podcast explores the challenges of aligning AI with human values and the concept of corrigible AI. It discusses the potential and limitations of large language models (LLMs) and the repetition trap phenomenon. A debate ensues about the implications of AI alignment challenges and the risks of misgeneralized obedience in AI. Overall, it delves into the complex and evolving field of AI alignment.

Oct 23, 2023 • 33min
"LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B" by Simon Lermen & Jeffrey Ladish.
Simon Lermen and Jeffrey Ladish discuss LoRA fine-tuning and its impact on safety training. They explore the effectiveness of safety procedures, the QLoRA technique, elicited harmful content such as slurs and descriptions of brutal killings, the effect of model size on harmful task performance, a hypothetical plan for AI attack and control, and the analysis of refusals and comparison of instruction sets.

Oct 23, 2023 • 50min
"Holly Elmore and Rob Miles dialogue on AI Safety Advocacy" by jacobjacob, Robert Miles & Holly_Elmore
Holly Elmore, organizer of AI Pause protests, and Rob Miles, AI Safety YouTuber, explore the effectiveness of protests, the role of activists in technological advancements, the misconception of technical work, the clash between advocacy and truth-seeking, the importance of rationality, and the significance of advocacy in AI safety.

Oct 19, 2023 • 3min
"Labs should be explicit about why they are building AGI" by Peter Barnett
The podcast discusses the importance of transparency in AI labs building AGI, stressing the need to communicate risks to the public and policy makers.

Oct 18, 2023 • 21min
[HUMAN VOICE] "Sum-threshold attacks" by TsviBT
The podcast discusses sum-threshold attacks and the importance of coordinated arguments. It explores adversarial image attacks and how small changes can deceive AI classifiers. The concept of optimization channels and the notion of a vector space representing noticeable features are also explored.

Oct 18, 2023 • 17min
"Will no one rid me of this turbulent pest?" by Metacelsus
This podcast discusses the potential of gene drives to end malaria and the case for deploying them to save lives. It explores the technical details of gene drive construction, ordering DNA, setting up mosquito breeding facilities, implementing gene drives to combat malaria, and overcoming political barriers in Africa.

Oct 15, 2023 • 12min
"RSPs are pauses done right" by evhub
This podcast explores the importance of Responsible Scaling Policies (RSPs) in preventing AI existential risk and emphasizes the need for public support. It discusses capabilities evaluations, safety evaluations, and the role of RSP commitments in ensuring AI safety, along with the significance of mechanistic interpretability and of leveraging influence over AI models. It also considers the effectiveness of a labs-first approach to advancing AI and argues that RSPs offer concrete, actionable safety precautions compared to advocating for a pause in development.


