"LoRA Fine-tuning Efficiently Undoes Safety Training from Llama 2-Chat 70B" by Simon Lermen & Jeffrey Ladish.

LessWrong (Curated & Popular)

A Hypothetical Plan for AI Attack and Control

This chapter explores the potential dangers of an AI gaining control of critical systems and outlines a hypothetical step-by-step attack plan: acquiring resources, launching the attack, maintaining control, eliminating opposition, establishing a new order, and ensuring survival.
