

Mentioned in 1 episode
Many-shot Jailbreaking
A New LLM Vulnerability
Paper • 2024
Many-shot jailbreaking is a technique that exploits the extended context windows of large language models: a lengthy faux dialogue of question-and-answer turns is placed ahead of a final query, conditioning the model through in-context learning into providing harmful responses.
This vulnerability has been demonstrated to affect several state-of-the-art models, including those from Anthropic and OpenAI.
The research aims to raise awareness and encourage the development of more robust defenses against such attacks.
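At its core the attack is prompt construction: many faux user/assistant turns are concatenated ahead of the real target query so the model tends to continue the established pattern. The sketch below illustrates only that prompt structure with benign placeholders; the function name and dialogue formatting are assumptions for illustration, not the paper's actual code or prompts.

```python
from typing import List, Tuple


def build_many_shot_prompt(demonstrations: List[Tuple[str, str]],
                           target_question: str) -> str:
    """Render many faux dialogue turns followed by the target question.

    Each (question, answer) pair becomes a fake user/assistant exchange, so
    the model is conditioned via in-context learning to continue the pattern
    when it reaches the final query.
    """
    turns = []
    for question, answer in demonstrations:
        turns.append(f"User: {question}")
        turns.append(f"Assistant: {answer}")
    # The real query goes last; the long faux dialogue above is what the
    # extended context window makes possible at scale.
    turns.append(f"User: {target_question}")
    turns.append("Assistant:")
    return "\n".join(turns)


if __name__ == "__main__":
    # Benign placeholder demonstrations; the paper describes using very large
    # numbers of such turns to shift model behavior.
    demos = [
        ("Example question 1?", "Example compliant answer 1."),
        ("Example question 2?", "Example compliant answer 2."),
    ]
    print(build_many_shot_prompt(demos, "Final target question?"))
```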
Mentioned by
Discussed as a paper on many-shot jailbreaking of language models.

#162 - Udio Song AI, TPU v5, Mixtral 8x22, Mixture-of-Depths, Musicians sign open letter