
"Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment" by elspood

Introduction

elspood has been doing red team/blue team (offensive/defensive) computer security for a living since September 2000. The goal of this post is to compile a list of general principles I've learned during this time that are likely relevant to the field of AGI alignment. If this is useful, I could continue with a broader or deeper exploration. Alignment won't happen by accident: an AGI that isn't intentionally designed not to exhibit a particular failure mode is going to have that failure mode. To have any chance at all, we will have to plan in advance for as many failure modes as we can possibly conceive.
