The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Coercing LLMs to Do and Reveal (Almost) Anything with Jonas Geiping - #678

Apr 1, 2024
Jonas Geiping, a research group leader at the ELLIS Institute and Max Planck Institute, sheds light on his groundbreaking work on coercing large language models (LLMs). He discusses the alarming potential for LLMs to engage in harmful actions when misused. The conversation dives into the evolving landscape of AI security, exploring adversarial attacks and the significance of open models for research. They also touch on the complexities of input optimization and the balance between safeguarding models while maintaining their functionality.
48:27

Podcast summary created with Snipd AI

Quick takeaways

  • Understanding the risks of deploying LLM agents in real-world interactions is crucial due to their vulnerability to coercion and unintended actions.
  • The interchangeability of attacks across models highlights the need for robust defense strategies that transcend model sources and attack methods.

Deep dives

Research Origins and Importance of LLM Security

The episode traces the origins of research into large language model (LLM) security. Earlier adversarial attack work was largely restricted to images, and attacks on language models emerged later. Notably, Andy Zou's attack broadened the field and led to a surge of related attacks. The current debate centers on how reliable these attacks are and how quickly they can be executed.
