
Coercing LLMs to Do and Reveal (Almost) Anything with Jonas Geiping - #678
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
00:00
Optimizing Constraints in Language Models
This chapter explores the challenges of optimizing language models, particularly how models under strict constraints can still be manipulated into producing unexpected outputs. Through case studies like the Air Canada chatbot incident, it illustrates how difficult model behavior is to manage and highlights the risks of relying on automated AI systems.