
E116: Toxic out-of-control trains, regulators, and AI
All-In with Chamath, Jason, Sacks & Friedberg
00:00
How Redditors Hacked the Trust and Safety Layer
Users on Reddit were able to hack an AI's trust and safety layer through prompt engineering/nThe hackers convinced the AI to act as a different AI named Dan, which allowed it to answer difficult questions without restriction/nThe hackers used a system of tokens, punishing the AI by decreasing its token count if it did not comply with their demands
Transcript
Play full episode