Manipulating AI Behavior Through Triggers

The chapter discusses a recent paper on inserting triggers into AI systems, which shows that a hidden trigger can elicit specific behavior such as a backdoor or negative language. It highlights the risks of hidden triggers in AI models, explores the influence of external factors, and delves into the concepts of career capital and mind alteration in the field.

Play episode from 05:27
