

LLMs for Evil
Sep 25, 2023
Maximilian Mozes, a PhD student at University College London specializing in NLP and adversarial machine learning, discusses the potential malicious uses of large language models (LLMs), the challenges of detecting AI-generated harmful content, reinforcement learning from human feedback (RLHF), the limitations and safety concerns of LLMs, the threats posed by data poisoning and jailbreaking, and approaches to mitigating these issues.
Chapters
Introduction (00:00 • 2min)
The Use of Large Language Models for Malicious Purposes (01:50 • 11min)
Reinforcement Learning from Human Feedback and the Challenges of Offensive Content (12:44 • 2min)
Limitations and Safety Concerns with Large Language Models (14:21 • 2min)
Threats and Potential Harm from Data Poisoning and Jailbreaking in Large Language Models (16:13 • 3min)
Approaches to Avoid Issues with Large Language Models (18:56 • 7min)