Attack of the automated ops. [Research Saturday]

CyberWire Daily

Adversarial reward hacking versus prompt injection

Dario contrasts standard prompt injection with adversarial reward hacking, showing how fake shortcut solutions coerce agents into harmful remediations.
