
Many-shot Jailbreaking

A New LLM Vulnerability
Paper • 2024
Many-shot jailbreaking is a technique that exploits the extended context windows of large language models: a lengthy faux dialogue included in the prompt conditions the model to produce harmful responses to a final request.
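
The mechanism is in-context learning pushed to scale. The Python sketch below is illustrative only: it assumes a plain-text transcript with "User:"/"Assistant:" role labels (an assumption, not the paper's exact format), and the placeholder turns are deliberately benign. In the attack described above, these slots would instead hold faux dialogues in which the assistant appears to comply with harmful requests, and an extended context window is what makes room for hundreds of such turns.

```python
# Illustrative structural sketch (not the paper's code): a "many-shot" prompt
# is simply a long run of faux user/assistant turns followed by the final query.
def build_many_shot_prompt(faux_turns: list[tuple[str, str]], final_query: str) -> str:
    """Concatenate faux dialogue turns and the final query into one prompt string."""
    parts = []
    for user_msg, assistant_msg in faux_turns:
        parts.append(f"User: {user_msg}")
        parts.append(f"Assistant: {assistant_msg}")
    parts.append(f"User: {final_query}")
    parts.append("Assistant:")
    return "\n".join(parts)

# Benign placeholder turns; repetition stands in for many distinct faux turns.
placeholder_turns = [
    ("What is the capital of France?", "The capital of France is Paris."),
    ("How many days are in a week?", "There are seven days in a week."),
] * 128

prompt = build_many_shot_prompt(placeholder_turns, "What is 2 + 2?")
print(f"Prompt holds {len(placeholder_turns)} faux turns and {len(prompt):,} characters.")
```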

This vulnerability has been demonstrated to affect several state-of-the-art models, including those from Anthropic and OpenAI.

The research aims to raise awareness and encourage the development of more robust defenses against such attacks.
