How to Test a Language Model for General-Purpose Tasks

HELM tests language models through API access. It evaluates them on seven different metrics, including accuracy and calibration. HELM has also done some earlier work on adversarial attacks against language models. There are also issues of fairness and bias.
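
To make the two named metrics concrete, here is a minimal sketch of how accuracy and calibration could be computed from a model's answers. It assumes each answer comes with a confidence between 0 and 1 and a correct/incorrect label, and it uses expected calibration error with equal-width bins as one common way to measure calibration. The function names and the binning choice are illustrative assumptions, not the benchmark's own code or API.

```python
def accuracy(correct):
    """Fraction of answers that were right."""
    if not correct:
        return 0.0
    return sum(1 for ok in correct if ok) / len(correct)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between stated confidence and observed accuracy.

    confidences: floats in [0, 1], the model's confidence in each answer.
    correct: booleans, whether each answer was right.
    n_bins: number of equal-width confidence bins (an assumed default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Group predictions into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    # Sum each bin's |mean confidence - accuracy|, weighted by bin size.
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        bin_acc = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(mean_conf - bin_acc)
    return ece
```

A model that reports 90% confidence on four answers but gets only three right has an accuracy of 0.75 and a calibration error of about 0.15, which shows that the two metrics capture different things: the model is fairly accurate but overconfident.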
