The Data Exchange with Ben Lorica

Evaluating Language Models

Jan 19, 2023

Chapters

The Holistic Evaluation of Language Models

How to Test a Language Model for General Purpose Tasks

The Seven Metrics of Detoxification Toxicity

How to Score Language Models in a Benchmark

How to Use HELM to Narrow Down Your Use Cases

The Future of HELM: A Community Initiative

The Importance of Scaling Language Models

The Complexity of Copyright

The Challenges of Evaluating Language Models

The Gap Between Open AI and Private Models

How to Measure Efficiency in the API

The Future of Language Models

The Promise of Language Models

Fine-Tuning General Purpose Language Models

The Paradigm Shift in Foundation Models

The Future of Foundation Models

The Importance of Data

The Importance of Externalizing Knowledge

HELM: A Center for Foundation Models