
20 - 'Reform' AI Alignment with Scott Aaronson
AXRP - the AI X-risk Research Podcast
How to Defend Against AI Alignment Attacks
Google Translate can translate text from one language to another, and GPT, the language model developed by OpenAI, can do the same. Watermarking and other techniques are meant to protect against this kind of attack, but it is not clear whether any rules would prevent such attacks.