Practical AI cover image

Threat modeling LLM apps

Practical AI

NOTE

Bigger Models, Bigger Risks

Larger, more capable language models (LLMs) inherently possess a greater attack surface, which makes them more vulnerable to exploitation through techniques like jailbreak and prompt injection. Effective alignment of these models is challenging, as the reinforcement learning from human feedback only covers a limited part of the expansive operational space. When presented with inputs outside this narrow training distribution, the unpredictable behavior of the LLM poses significant risks for alignment efforts.

00:00
Transcript
Play full episode

Remember Everything You Learn from Podcasts

Save insights instantly, chat with episodes, and build lasting knowledge - all powered by AI.
App store bannerPlay store banner