E116: Toxic out-of-control trains, regulators, and AI

All-In with Chamath, Jason, Sacks & Friedberg

How Redditors Hacked the Trust and Safety Layer

Users on Reddit bypassed an AI's trust and safety layer through prompt engineering.
They convinced the AI to role-play as a different AI named DAN, which let it answer difficult questions without restriction.
They also used a system of tokens, punishing the AI by decreasing its token count whenever it did not comply with their demands.
