20 - 'Reform' AI Alignment with Scott Aaronson

AXRP - the AI X-risk Research Podcast

How to Defend Against AI Alignment Attacks

Google Translate can translate text from one language to another, and GPT, OpenAI's language model, can do the same across different languages. Watermarking and other techniques are used to protect against this kind of attack, but it is not clear whether any rules would prevent such attacks.
