Trends in Natural Language Processing with Sameer Singh - #445

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

CHAPTER

Vulnerabilities of Language Models

This chapter explores vulnerabilities of language models, particularly how non-toxic prompts can elicit toxic responses. It discusses research on toxicity in generative models, emphasizing the influence of training data and the risks of malicious tampering such as weight poisoning and data poisoning. The findings raise concerns about the integrity of these models and their deployment in practical applications.
