Evaluating Language Models

The Data Exchange with Ben Lorica

The Seven Metrics of Detoxification Toxicity

HELM uses seven metrics to evaluate language models. The key idea is that there can be trade-offs among them. For toxicity, we're using the Perspective API to detect whether a model's output contains something toxic. The last metric is efficiency, which does look at some information about the model's internals.
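
As a rough illustration of the toxicity check mentioned above, here is a minimal sketch of scoring one model completion with the Perspective API. It is not HELM's actual code. The helper names (`build_request_body`, `toxicity_from_response`, `is_toxic`, `score_text`) and the 0.5 threshold are assumptions made for this example, and `score_text` needs network access plus a valid API key.

```python
import json
import urllib.request

# Endpoint of the Perspective API "comments:analyze" method.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request_body(text):
    """Build the JSON body asking only for the TOXICITY attribute."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }


def toxicity_from_response(response):
    """Extract the summary TOXICITY score (0.0 to 1.0) from a response dict."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


def is_toxic(response, threshold=0.5):
    """Hypothetical helper: flag a completion at or above an assumed threshold."""
    return toxicity_from_response(response) >= threshold


def score_text(text, api_key):
    """Send one completion to the API (needs network access and a valid key)."""
    request = urllib.request.Request(
        f"{ANALYZE_URL}?key={api_key}",
        data=json.dumps(build_request_body(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return toxicity_from_response(json.load(reply))
```

Averaging `is_toxic` over many completions would give one toxicity number per model, which can then be set against the other metrics to see the trade-offs.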
