
Live from re:Invent…it’s Stack Overflow!
The Stack Overflow Podcast
LLM evaluations and real failures
Michael shares concrete LLM failure examples and stresses continual evaluation and human oversight.
