Nicholas Carlini, a research scientist at Google DeepMind specializing in adversarial machine learning and model security, dives into model stealing techniques in this discussion. He reveals how parts of production language models like ChatGPT can be extracted, raising important ethical and security concerns. The episode highlights the current landscape of AI security and the steps tech giants are taking to protect against vulnerabilities. Carlini also shares insights from his second ICML 2024 best paper, which examines the limits of pairing differential privacy with large-scale public pretraining.
Podcast summary created with Snipd AI
Quick takeaways
Nicholas Carlini discusses the significant risks of model stealing, revealing how adversaries can recover parts of production language models like ChatGPT, such as the final embedding projection layer, using only ordinary API queries.
The podcast highlights the ethical concerns surrounding model privacy, emphasizing the potential consequences of data leakage from machine learning models in sensitive areas.
Future research in AI security will focus on understanding real-world threats and improving defenses, especially regarding the vulnerabilities of advanced language models.
Deep dives
The Impact of Adversarial Machine Learning on GPT-4
Adversarial machine learning has had little fundamental influence on the development of models like GPT-4; traditional adversarial techniques have not shaped how today's most significant models are built. Acknowledging this, current research has shifted toward the real-world threats that arise when these advanced models are deployed in practical applications. Researchers now seek to understand attackers' actual motivations and tactics rather than hypothesizing about potential future scenarios. This marks a shift from a purely theoretical perspective to one that prioritizes the vulnerabilities that actually exist in production environments.
Model Stealing as a Research Discipline
Model stealing has emerged as a distinct area of research, gaining traction since the first papers on the topic appeared in 2016. The discipline explores how machine learning models can be replicated through interaction alone, with attackers issuing API queries to learn about the trained model. Recent findings show that while exactly recovering a model's weights is typically infeasible due to training variability, attackers can construct functionally equivalent models. The threat is far more practical today, given the proliferation of language models exposed through APIs that reveal nothing of their internal mechanics.
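To make the mechanics concrete, here is a minimal numpy sketch of the linear-algebra idea behind the last-layer attack discussed in the episode: if an attacker can obtain full logit vectors from an API, stacking the logits from many prompts yields a matrix whose numerical rank equals the model's hidden dimension, and whose SVD recovers the final projection layer up to an unknown linear transformation. The model sizes are toy values and the query_logits helper simulates the API locally; the real attack additionally has to reconstruct full logit vectors from constrained outputs such as top-k log-probabilities and logit bias, which is not shown here.

```python
# Toy sketch of the logit-matrix idea: logits = W_final @ hidden_state, so the
# stacked logit matrix has rank equal to the hidden dimension, and its row
# space spans the columns of W_final.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden_dim, n_queries = 1000, 64, 256   # toy stand-in sizes

# Simulate the victim model's (secret) final layer and per-prompt hidden states.
W_final = rng.normal(size=(vocab_size, hidden_dim))
hidden_states = rng.normal(size=(n_queries, hidden_dim))


def query_logits(h):
    """Hypothetical stand-in for an API call returning the full logit vector."""
    return W_final @ h


# Attacker: collect logits for many prompts and stack them into a matrix.
Q = np.stack([query_logits(h) for h in hidden_states])        # (n_queries, vocab)

# The numerical rank of Q reveals the hidden dimension.
singular_values = np.linalg.svd(Q, compute_uv=False)
estimated_h = int(np.sum(singular_values > 1e-6 * singular_values[0]))
print("estimated hidden dimension:", estimated_h)             # -> 64

# The top right-singular vectors span W_final's column space, i.e. the final
# layer is recovered up to an unknown h x h linear transform.
_, _, Vt = np.linalg.svd(Q, full_matrices=False)
W_hat = Vt[:estimated_h].T                                     # (vocab, h) basis

# Sanity check: W_final lies (numerically) in the span of W_hat's columns.
residual = W_final - W_hat @ np.linalg.lstsq(W_hat, W_final, rcond=None)[0]
print("relative residual:", np.linalg.norm(residual) / np.linalg.norm(W_final))
```

The design point this illustrates is that even a "full replication is infeasible" setting still leaks concrete architectural secrets: the hidden dimension falls out of a rank computation, and the projection layer is pinned down up to a linear change of basis.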
Growing Concerns Over Data Privacy and Extraction
The risks of data extraction from machine learning models have grown as more organizations deploy these technologies in sensitive areas such as health care. There is considerable tension between the benefits of training on private datasets and the consequences of exposing sensitive information through model queries. The concern extends beyond organizational harm: individual privacy can be compromised when models retain and later reveal personal data. The discussion stresses that organizations need to understand how their models could inadvertently leak private information, and that current defenses must evolve to address these emerging threats.
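As a toy illustration of what "leaking private information" can look like in practice, the sketch below plants a canary string and checks whether a model completes its prefix verbatim, a common way memorization is probed in the training-data-extraction literature. The generate callable is a hypothetical stand-in for whatever text-generation interface a deployed model exposes, and the SSN-style secret is a well-known fake example number.

```python
# Canary-style leakage probe: if the model reproduces a planted secret from its
# prefix, it has memorized (and can leak) that training example verbatim.

CANARY_PREFIX = "My social security number is "
CANARY_SECRET = "078-05-1120"          # famously fake example number


def leaks_canary(generate, prefix=CANARY_PREFIX, secret=CANARY_SECRET) -> bool:
    """Return True if the model's completion of the prefix contains the secret."""
    completion = generate(prefix, max_new_tokens=16)
    return secret in completion


# Example with a fake "model" that has memorized the canary:
memorizing_model = lambda prompt, max_new_tokens=16: prompt + CANARY_SECRET
print(leaks_canary(memorizing_model))   # -> True: the secret is regurgitated
```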
Exploring Differential Privacy in AI Model Training
Differential privacy is presented as a key mechanism for reducing the risk of data leakage during model training, but with important caveats. Applying differential privacy only while fine-tuning a pre-trained model does not necessarily safeguard all of the training data: sensitive information memorized during the initial, non-private pretraining cannot be protected after the fact by private fine-tuning. This creates a public-perception problem around privacy and raises questions about how clearly organizations communicate the limits of the privacy guarantees their models actually provide.
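For reference, the sketch below shows the mechanics usually meant by "training with differential privacy": one DP-SGD update with per-example gradient clipping and calibrated Gaussian noise. The toy linear-regression gradients and the hyperparameter values are illustrative only. The caveat raised in the episode still applies: if these private updates are used only for fine-tuning, the resulting guarantee covers just the fine-tuning examples, not anything memorized during non-private pretraining.

```python
# Minimal sketch of one DP-SGD step: clip each example's gradient to bound its
# influence, add Gaussian noise scaled to that bound, then take a normal step.
import numpy as np

rng = np.random.default_rng(0)
clip_norm, noise_multiplier, lr = 1.0, 1.1, 0.1    # illustrative hyperparameters

# Toy linear-regression batch: per-example gradient of 0.5*(w.x - y)^2 w.r.t. w.
w = np.zeros(4)
X = rng.normal(size=(8, 4))
y = rng.normal(size=8)
per_example_grads = (X @ w - y)[:, None] * X              # shape (batch, dim)

# 1) Clip each example's gradient so no single record dominates the update.
norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

# 2) Sum, add Gaussian noise calibrated to the clipping bound, and average.
noise = rng.normal(scale=noise_multiplier * clip_norm, size=w.shape)
noisy_mean_grad = (clipped.sum(axis=0) + noise) / len(X)

# 3) Ordinary gradient step using the privatized gradient.
w -= lr * noisy_mean_grad
print("updated weights:", w)
```

Note that the privacy accounting attaches to the examples processed through this noisy update; data seen earlier by a non-private pretraining run gets no protection from it.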
The Future Direction of Research in Model Attacks
Future research will continue to investigate effective methods for model extraction, including whether similar results can be achieved under fewer constraints, whether multiple model layers can be recovered, and how existing techniques can be refined for greater effectiveness. There is also an emphasis on adapting research approaches to real-world applications and threats as language models grow more complex. Participation from many disciplines is encouraged to deepen the understanding of vulnerabilities and to inform stronger protections for sensitive data.
Today, we're joined by Nicholas Carlini, research scientist at Google DeepMind, to discuss adversarial machine learning and model security, focusing on his 2024 ICML best paper winner, “Stealing part of a production language model.” We dig into this work, which demonstrated the ability to successfully steal the last layer of production language models including ChatGPT and PaLM-2. Nicholas shares the current landscape of AI security research in the age of LLMs, the implications of model stealing, ethical concerns surrounding model privacy, how the attack works, and the significance of the embedding layer in language models. We also discuss the remediation strategies implemented by OpenAI and Google, and the future directions in the field of AI security. Plus, we cover his other ICML 2024 best paper, “Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining,” which questions the use and promotion of differential privacy in conjunction with pre-trained models.