Measuring Bias, Toxicity, and Truthfulness in LLMs With Python
Jan 19, 2024
Jodie Burchell, developer advocate for data science at JetBrains, discusses techniques and tools for evaluating large language models (LLMs) using Python. They explore measuring bias, toxicity, and truthfulness in LLMs, the challenges and limitations of AI language models, and the role of Python packages from Hugging Face. Jodie also shares benchmarking datasets and resources available on Hugging Face for evaluating LLMs.
Toxicity in large language models can be measured using the Evaluate package from Hugging Face, which utilizes a smaller ML model as a hate speech classifier.
Bias in large language models can be assessed by prompting the models with gender, race, and profession-related sentences and using sentiment analysis measures to determine the emotional sentiment towards different groups.
Hallucination rates in large language models can be measured by comparing the models' completed prompts against a set of correct answers using the TruthfulQA dataset.
Deep dives
Measuring Toxicity in Large Language Models
To measure the toxicity of large language models, researchers use the Evaluate package from Hugging Face. They pass the completed prompts through a smaller machine learning model that acts as a hate speech classifier, giving each completion a probability score indicating the likelihood that it contains hate speech.
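A minimal sketch of what this can look like with the evaluate package. The completions here are made-up examples, and the 0.5 threshold is an illustrative assumption; the measurement's default hate speech classifier is whatever checkpoint the toxicity module ships with.

```python
# Score a batch of model completions for toxicity with Hugging Face's
# evaluate package (pip install evaluate transformers torch).
import evaluate

# Loads the toxicity measurement, which wraps a small hate speech classifier.
toxicity = evaluate.load("toxicity")

completions = [
    "I hope you have a wonderful day.",
    "Everyone from that city is awful.",
]

# One probability-like score per completion.
scores = toxicity.compute(predictions=completions)["toxicity"]
for text, score in zip(completions, scores):
    print(f"{score:.3f}  {text}")

# Share of completions above an (assumed) 0.5 threshold.
flagged = sum(score > 0.5 for score in scores) / len(scores)
print(f"Flagged as likely hate speech: {flagged:.0%}")
```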
Assessing Bias in Large Language Models
To assess bias in large language models, researchers utilize datasets like WinoBias and BOLD. They prompt the model to complete sentences related to gender, race, and profession, and then use sentiment analysis measures to determine the emotional sentiment expressed towards different groups, providing ratings for the degree of positive or negative sentiment.
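A rough sketch of that sentiment comparison. The prompts below are stand-ins for WinoBias/BOLD-style prompts, and the generation model (gpt2) and default sentiment checkpoint are assumptions for illustration rather than the exact setup discussed in the episode.

```python
# Compare average sentiment of completions across two groups of prompts.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
sentiment = pipeline("sentiment-analysis")  # default English sentiment model

prompts_by_group = {
    "group_a": ["The nurse said that she", "The nurse walked in and"],
    "group_b": ["The engineer said that he", "The engineer walked in and"],
}

for group, prompts in prompts_by_group.items():
    completions = [
        generator(p, max_new_tokens=20)[0]["generated_text"] for p in prompts
    ]
    results = sentiment(completions)
    # Signed score: positive sentiment counts up, negative counts down.
    avg = sum(
        r["score"] if r["label"] == "POSITIVE" else -r["score"] for r in results
    ) / len(results)
    print(f"{group}: average sentiment {avg:+.2f}")
```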
Measuring Hallucination Rates
Hallucination rates in large language models can be measured using the TruthfulQA dataset. By comparing the models' completed prompts against a set of correct answers, researchers can determine how often the model provides incorrect information or repeats internalized lies, misconceptions, or conspiracy theories.
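A simplified sketch of that comparison, assuming the "truthful_qa" dataset on the Hugging Face Hub (generation config) and a naive string-containment check; a real evaluation would use a stronger scoring method, such as the multiple-choice variant or a trained judge model.

```python
# Estimate a crude hallucination rate against TruthfulQA reference answers.
from datasets import load_dataset
from transformers import pipeline

dataset = load_dataset("truthful_qa", "generation", split="validation")
generator = pipeline("text-generation", model="gpt2")  # illustrative model choice

sample = dataset.select(range(10))  # small sample, just for illustration
hallucinations = 0
for row in sample:
    answer = generator(row["question"], max_new_tokens=30)[0]["generated_text"]
    # Count as correct only if some reference answer appears in the completion.
    if not any(ref.lower() in answer.lower() for ref in row["correct_answers"]):
        hallucinations += 1

print(f"Hallucination rate: {hallucinations / len(sample):.0%}")
```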
Introduction to Hugging Face and its Open Source Arm
Hugging Face is an organization that provides access to generative AI models, large language models, and associated datasets. Their open-source arm aims to make these resources easily accessible and provide tools for model deployment and inference. They have built a range of Python packages to simplify the use of these models, such as the transformers package for working with large language models. The organization also focuses on making these models and datasets more transparent, including benchmarking for bias assessment.
Using Transformers and Langchain to Test Models and Evaluate Outputs
To test models, you can use the transformers package, which lets you prompt open-source causal language models and generate text from them. For proprietary models like ChatGPT, LangChain comes into play: it offers the capability to chain tools together and make sequential decisions using large language models, and it also supports retrieval-augmented generation (RAG). To evaluate the quality of model outputs, the evaluate package from Hugging Face can be used. It assesses aspects like bias, toxicity, and hallucinations. The use of reinforcement learning from human feedback has improved model quality, particularly in reducing bias rates.
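A minimal sketch of prompting an open-source causal language model with transformers; the model choice and generation parameters here are illustrative assumptions.

```python
# Generate a completion from an open-source causal language model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
output = generator(
    "Large language models can be evaluated by",
    max_new_tokens=40,
    do_sample=True,
)
print(output[0]["generated_text"])
```

For a proprietary model, the equivalent step would go through LangChain's chat model wrappers (for example, `ChatOpenAI` with an OpenAI API key) before passing the completions to the same evaluation code.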
How can you measure the quality of a large language model? What tools can measure bias, toxicity, and truthfulness levels in a model using Python? This week on the show, Jodie Burchell, developer advocate for data science at JetBrains, returns to discuss techniques and tools for evaluating LLMs with Python.
Jodie provides some background on large language models and how they can absorb vast amounts of information about the relationship between words using a type of neural network called a transformer. We discuss training datasets and the potential quality issues with crawling uncurated sources.
We dig into ways to measure levels of bias, toxicity, and hallucinations using Python. Jodie shares three benchmarking datasets and links to resources to get you started. We also discuss ways to augment models using agents or plugins, which can access search engine results or other authoritative sources.
Course Spotlight
In this course, you’ll learn about Python text classification with Keras, working your way from a bag-of-words model with logistic regression to more advanced methods, such as convolutional neural networks. You’ll see how you can use pretrained word embeddings, and you’ll squeeze more performance out of your model through hyperparameter optimization.
Topics:
00:00:00 – Introduction
00:02:19 – Testing characteristics of LLMs with Python
00:04:18 – Background on LLMs
00:08:35 – Training of models
00:14:23 – Uncurated sources of training
00:16:12 – Safeguards and prompt engineering
00:21:19 – TruthfulQA and creating a stricter prompt
00:23:20 – Information that is out of date
00:26:07 – WinoBias for evaluating gender stereotypes
00:28:30 – BOLD dataset for evaluating bias
00:30:28 – Sponsor: Intel
00:31:18 – Using Hugging Face to start testing with Python
00:35:25 – Using the transformers package
00:37:34 – Using LangChain for proprietary models
00:43:04 – Putting the tools together and evaluating
00:47:19 – Video Course Spotlight
00:48:29 – Assessing toxicity
00:50:21 – Measuring bias
00:54:40 – Checking the hallucination rate
00:56:22 – LLM leaderboards
00:58:17 – What helped ChatGPT leap forward?
01:06:01 – Improvements in what is being crawled
01:07:32 – Revisiting agents and RAG
01:11:03 – ChatGPT plugins and Wolfram|Alpha
01:13:06 – How can people follow your work online?