

LLMs for Evil
Sep 25, 2023
Maximilian Mozes, a PhD student at University College London specializing in NLP and adversarial machine learning, discusses the potential malicious uses of large language models (LLMs), the challenges of detecting AI-generated harmful content, reinforcement learning from human feedback (RLHF), the limitations and safety concerns of LLMs, the threats posed by data poisoning and jailbreaking, and approaches to mitigating these issues.
Chapters
Introduction (00:00 • 2min)
The Use of Large Language Models for Malicious Purposes (01:50 • 11min)
Reinforcement Learning from Human Feedback and the Challenges of Offensive Content (12:44 • 2min)
Limitations and Safety Concerns with Large Language Models (14:21 • 2min)
Threats and Potential Harm from Data Poisoning and Jailbreaking in Large Language Models (16:13 • 3min)
Approaches to Avoid Issues with Large Language Models (18:56 • 7min)