
Trends in Natural Language Processing with Sameer Singh - #445
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Vulnerabilities of Language Models
This chapter explores the inherent vulnerabilities of language models, particularly how non-toxic prompts can lead to toxic responses. It discusses critical research on toxicity in generative models, emphasizing the influence of training data and the risks of malicious alterations like weight and data poisoning. The findings raise important concerns about the integrity and deployment of these models in practical applications.