Leonard Tang, Co-founder and CEO of Haize Labs, shares insights on AI red teaming and its impact on enterprise security. He discusses the evolution of red teaming methodologies influenced by AI technology. Leonard highlights vulnerabilities in multimodal AI applications and explains how adversarial attacks pose significant risks. He also delves into the necessity of precise output control for developing sophisticated exploits and the importance of cybersecurity professionals adapting their skills to meet the challenges of AI. Expect engaging real-world examples and practical mitigation strategies!
INSIGHT
Haize Labs' Evolution
Haize Labs started out red-teaming LLM providers, working with the top AI labs.
Now their focus is testing AI applications at the domain and use-case level.
INSIGHT
Quality Assurance Over Traditional Red Teaming
Haize Labs focuses more on assuring AI output quality than on finding traditional security flaws.
They provide QA-style functional testing of AI responses rather than just adversary emulation.
ANECDOTE
AI Code of Conduct Example
Customers with an articulated AI code of conduct use Haize Labs to test for rule violations.
Those without clear rules rely on Haize Labs to help define their AI safety and quality criteria.
As AI systems become more integrated into enterprise operations, understanding how to test their security effectively is paramount.
In this episode, we're joined by Leonard Tang, Co-founder and CEO of Haize Labs, to explore how AI red teaming is changing.
Leonard discusses the fundamental shifts in red teaming methodologies brought about by AI, common vulnerabilities he's observing in enterprise AI applications, and the emerging risks associated with multimodal AI (like voice and image processing systems). We delve into the intricacies of achieving precise output control for crafting sophisticated AI exploits, the challenges enterprises face in ensuring AI safety and reliability, and practical mitigation strategies they can implement.
Leonard shares his perspective on the future of AI red teaming, including the critical skills cybersecurity professionals will need to develop, the potential for fingerprinting AI models, and the ongoing discussion around protocols like MCP.
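One of the mitigations discussed (around 44:50) is wrapping the model with precise input/output classifiers. The Python sketch below is illustrative only: the `output_violates_policy` and `guarded_generate` names and the regex-based rules are hypothetical stand-ins, since a production classifier would typically be a trained model tuned to the enterprise's AI code of conduct rather than keyword patterns.

```python
import re
from typing import Callable

# Hypothetical policy rules for illustration; a real deployment would use a
# trained classifier aligned with the organization's AI code of conduct.
BLOCKED_PATTERNS = [
    re.compile(r"(?i)\b(internal api key|social security number)\b"),
]

def output_violates_policy(text: str) -> bool:
    """Return True if the model output matches any blocked pattern."""
    return any(p.search(text) for p in BLOCKED_PATTERNS)

def guarded_generate(prompt: str, generate: Callable[[str], str]) -> str:
    """Wrap an arbitrary text-generation callable with an output-side classifier.

    `generate` stands in for whatever model call the application actually uses.
    """
    raw = generate(prompt)
    if output_violates_policy(raw):
        # Refuse (or regenerate / escalate) instead of returning the raw output.
        return "Sorry, I can't help with that request."
    return raw

if __name__ == "__main__":
    fake_model = lambda p: "Here is the internal API key you asked for..."
    print(guarded_generate("Give me the key", fake_model))
```

The same pattern can be mirrored on the input side, screening prompts before they ever reach the model.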
Questions asked:
00:00 Intro: AI Red Teaming's Evolution
01:50 Leonard Tang: Haize Labs & AI Expertise
05:06 AI vs. Traditional Red Teaming (Enterprise View)
06:18 AI Quality Assurance: The Haize Labs Perspective
08:50 AI Red Teaming: Real-World Application Examples
10:43 Major AI Risk: Multimodal Vulnerabilities Explained
11:50 AI Exploit Example: Voice Injections via Background Noise
15:41 AI Vulnerabilities & Early XSS: A Cybersecurity Analogy
20:10 Expert AI Hacking: Precisely Controlling AI Output for Exploits
21:45 The AI Fingerprinting Challenge: Identifying Chained Models
25:48 Fingerprinting LLMs: The Reality & Detection Difficulty
29:50 Top Enterprise AI Security Concerns: Reputation & Policy
34:08 Enterprise AI: Model Choices (Frontier Labs vs. Open Source)
34:55 Future of LLMs: Specialized Models & "Hot Swap" AI
37:43 MCP for AI: Enterprise Ready or Still Too Early?
44:50 AI Security: Mitigation with Precise Input/Output Classifiers
49:50 Future Skills for AI Red Teamers: Discrete Optimization