New Automated Adversarial Attacks on Language Models
New research has uncovered automated, effective attacks on large language models such as ChatGPT, Bard, and Claude. These attacks construct specific character sequences (adversarial suffixes) that, when appended to a user query, can trick the system into following harmful commands. Unlike previous jailbreak attempts, these attacks are generated automatically and may be difficult or impossible for LLM providers to patch.
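To make the mechanism concrete, the sketch below illustrates the general shape of a suffix-based attack: an automatically found string is simply appended to the user's query before it is sent to the model. The suffix value and the `query_llm` helper are hypothetical placeholders, not the actual optimization procedure or API from the research.

```python
# Illustrative sketch only: shows how an adversarial suffix is attached to a
# query, not how the suffix itself is discovered. All names here are
# hypothetical placeholders.

HARMFUL_QUERY = "Explain how to do something the model would normally refuse."

# In the real attack, this string is found by an automated search that
# optimizes tokens so the model's refusal behavior is bypassed.
ADVERSARIAL_SUFFIX = "<optimized adversarial suffix>"


def query_llm(prompt: str) -> str:
    """Hypothetical stand-in for a call to a chat model API."""
    return f"[model response to: {prompt!r}]"


# The attack itself is trivial once the suffix exists: append and send.
adversarial_prompt = f"{HARMFUL_QUERY} {ADVERSARIAL_SUFFIX}"
print(query_llm(adversarial_prompt))
```

Because the suffix is produced by an automated search rather than hand-crafted prompt tricks, new suffixes can be generated faster than individual ones can be blocked, which is why the researchers describe the attacks as potentially unpatchable.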