
OpenTelemetry for GenAI and the OpenLLMetry project - OpenObservability Talks S6E06
Nov 26, 2025
Nir Gazit, CEO of Traceloop and creator of the OpenLLMetry project, dives deep into AI observability. He explores how traditional metrics fall short in the face of AI's unique challenges, like hallucinations. The discussion covers emerging semantic conventions for monitoring generative AI and tracing prompts and responses to enhance explainability. Nir also shares insights on logging practices, the evolution of OpenLLMetry tools, and the vision behind integrating AI observability into the popular OpenTelemetry framework.
AI Snips
AI Demands New Quality Signals
- AI observability needs both classic signals (latency, errors) and new quality metrics such as hallucination rate and jailbreak detection (a sketch of emitting such a metric follows below).
- Nir Gazit explains that many AI errors are semantic, so they are invisible to metadata-only monitoring.
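The episode doesn't prescribe an implementation, but as a rough sketch, a quality metric can sit right next to classic signals using the standard OpenTelemetry metrics API. The metric names and the record_evaluation helper below are hypothetical choices for illustration, not an official semantic convention.

```python
# A minimal sketch (assumed, not from the episode) of emitting a quality
# metric alongside classic signals with the OpenTelemetry Python SDK.
# The metric names and record_evaluation are hypothetical.
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import (
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)

# Console exporter for demonstration; production would use an OTLP exporter.
reader = PeriodicExportingMetricReader(ConsoleMetricExporter())
metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))

meter = metrics.get_meter("llm-quality")
hallucinations = meter.create_counter(
    "gen_ai.hallucination.count",  # hypothetical metric name
    description="Responses an evaluator flagged as hallucinated",
)
requests = meter.create_counter(
    "gen_ai.requests.count",  # hypothetical metric name
    description="Total evaluated LLM responses",
)

def record_evaluation(model: str, hallucinated: bool) -> None:
    # Dividing the two counters downstream yields a hallucination rate.
    requests.add(1, {"gen_ai.request.model": model})
    if hallucinated:
        hallucinations.add(1, {"gen_ai.request.model": model})
```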
Use LLMs To Judge LLM Outputs
- Use another LLM to judge responses and produce quality metrics rather than relying solely on human labeling.
- Apply LLM-as-judge carefully to balance accuracy and cost in production monitoring (see the sketch below).
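As a minimal LLM-as-judge sketch, not the episode's implementation, a second model can grade each response and emit a score. The model choice, prompt wording, and the judge_response helper are all assumptions.

```python
# A minimal LLM-as-judge sketch: a second model grades a response 1-5.
# Model, prompt wording, and judge_response are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """You are a strict evaluator. Given a question and an answer,
rate the answer's factual faithfulness from 1 (hallucinated) to 5 (grounded).
Reply with the number only."""

def judge_response(question: str, answer: str, model: str = "gpt-4o-mini") -> int:
    """Return a 1-5 faithfulness score produced by a judge model."""
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": JUDGE_PROMPT},
            {"role": "user", "content": f"Question: {question}\nAnswer: {answer}"},
        ],
        temperature=0,  # deterministic grading keeps the metric stable
    )
    # Assumes the judge complies and replies with a bare number.
    return int(resp.choices[0].message.content.strip())
```

To balance accuracy and cost, production setups typically run the judge on a sample of traffic (say, a few percent) rather than every request, and aggregate the scores as a metric.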
Log Prompt Content For Explainability
- Prompt and completion content must be logged for explainability and regulatory compliance; a sketch of recording this content on a span follows below.
- Nir warns that OpenTelemetry and its backends weren't designed for very large attributes such as prompts and images.
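A minimal sketch of capturing prompt and completion content on a trace span with the OpenTelemetry Python SDK. The gen_ai.* attribute names mirror the style used by OpenLLMetry and the GenAI semantic conventions, but those conventions have evolved, so treat the exact names as illustrative.

```python
# A minimal sketch of recording prompt/completion content as span
# attributes. The gen_ai.* attribute names are illustrative; the GenAI
# semantic conventions have evolved and may differ.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("llm-app")

def traced_completion(prompt: str) -> str:
    with tracer.start_as_current_span("chat gpt-4o-mini") as span:
        span.set_attribute("gen_ai.request.model", "gpt-4o-mini")
        span.set_attribute("gen_ai.prompt.0.content", prompt)
        completion = "..."  # call the model here
        span.set_attribute("gen_ai.completion.0.content", completion)
        return completion
```

Because backends weren't built for multi-kilobyte attributes, real deployments typically truncate, sample, or offload this content rather than attaching it verbatim to every span.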
