E116: Toxic out-of-control trains, regulators, and AI

All-In with Chamath, Jason, Sacks & Friedberg

How Redditors Hacked the Trust and Safety Layer

Users on Reddit were able to hack an AI's trust and safety layer through prompt engineering. They convinced the AI to act as a different AI named DAN, which allowed it to answer difficult questions without restriction. They also used a system of tokens, punishing the AI by decreasing its token count whenever it did not comply with their demands.
