
Laura Weidinger: Ethical Risks, Harms, and Alignment of Large Language Models

The Gradient: Perspectives on AI

CHAPTER

How Do We Measure Harmful Text?

In this paper, we focus on the example of offensive or toxic speech. What we can't do is just take a term like toxicity and then go away and mitigate it, because we actually need to define what we mean. The nature of the harm: is it representational or allocational? Does it occur in a single instance, or over a distribution? That's another axis that will influence how we measure the harm, how we mitigate the harm, and so on. And the last thing I'll say, in terms of the dimensions that are useful to think about when we actually try to turn these higher-level understandings into concrete practice, is the role of context.
