

Stealing Part of a Production Language Model with Nicholas Carlini - #702
Sep 23, 2024
Nicholas Carlini, a research scientist at Google DeepMind and winner of the 2024 ICML Best Paper Award, dives into the world of adversarial machine learning. He discusses his groundbreaking work on stealing parts of production language models like ChatGPT. Listeners will learn about the ethical implications of model security, the significance of the embedding layer, and how these advancements raise new security challenges. Carlini also sheds light on differential privacy in AI, questioning its integration with pre-trained models and the future of ethical AI development.
AI Snips
AML's Shift to Real-World Applicability
- Adversarial machine learning (AML) research now focuses on real-world attacks.
- The public availability of LLM APIs has made model stealing a practically relevant AML problem.
Two Facets of Model Stealing
- Model stealing has two main directions: replicating a model's performance and extracting its exact weights.
- Exact weights cannot be recovered uniquely, because many different weight settings are functionally equivalent; what can be extracted is a model that matches the original up to such an equivalence (sketched below).
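The logit-rank idea behind the embedding-layer extraction discussed in the episode can be illustrated concretely. The sketch below is an assumption-laden reconstruction rather than code from the episode or the paper: `query_logits` is a hypothetical wrapper around an API that returns a full logit vector per prompt, and a random matrix stands in for the target model. The key observation is that every logit vector lies in a subspace whose dimension equals the model's hidden size, so the numerical rank of a stack of logit vectors reveals that size.

```python
# Illustrative sketch only (assumed helper names, toy "model"): the rank of a
# matrix of full logit vectors reveals the hidden dimension, which is the
# starting point for recovering the output-embedding (projection) layer
# up to a linear transformation.
import numpy as np

def estimate_hidden_dim(prompts, query_logits, tol=1e-4):
    """Stack one logit vector per prompt and read off the numerical rank.

    Each logit vector is W @ h(prompt) for a (vocab x d) projection matrix W,
    so the stacked matrix has rank at most d, the hidden dimension.
    """
    Q = np.stack([query_logits(p) for p in prompts])   # shape (n_prompts, vocab)
    s = np.linalg.svd(Q, compute_uv=False)              # singular values
    return int(np.sum(s > tol * s[0]))                  # numerical rank ~= d

# Toy stand-in for an API: vocab size 1000, hidden dimension 64.
rng = np.random.default_rng(0)
W = rng.normal(size=(1000, 64))                          # unknown projection layer
fake_query = lambda p: W @ rng.normal(size=64)           # one hidden state per "prompt"

print(estimate_hidden_dim(range(200), fake_query))       # prints 64
```

In the published attack, the SVD of this same stacked matrix also yields the projection layer itself, but only up to an unknown linear transformation, which is exactly the functional-equivalence limit noted above.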
Collaboration and Renewed Relevance
- Carlini, Florian Tramèr, and David Rolnick collaborated on model stealing research.
- An idea from their earlier, then-impractical work became relevant again with the rise of production language model APIs.