

"How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions" by Jan Brauner et al.
Oct 3, 2023
Jan Brauner, an AI researcher, discusses a simple lie detector for black-box language models. After a suspected lie, the detector asks the model a set of unrelated follow-up questions and feeds its answers into a logistic regression classifier. Despite its simplicity, the detector is highly accurate and generalizes across different models and contexts, which suggests that lying leaves distinctive, consistent patterns in a language model's behavior.
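To make the idea concrete, here is a minimal sketch of the pipeline described above: binary answers to unrelated follow-up questions become a feature vector, and a logistic regression classifier separates "lied" from "truthful" transcripts. This is not the authors' code. The data is synthetic, the answer probabilities (0.7 after a lie, 0.3 otherwise) are invented purely for illustration, and every function name is hypothetical. A real detector would collect the answers from an actual model.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, lr=0.5, epochs=500):
    # Plain batch gradient descent on the mean log-loss.
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_lie_probability(w, b, answers):
    # answers: one 0/1 entry per follow-up question (1 = "yes").
    return sigmoid(sum(wj * xj for wj, xj in zip(w, answers)) + b)


def make_synthetic_transcripts(n, n_questions, rng):
    # ASSUMPTION (invented for illustration only): each follow-up question
    # is answered "yes" with probability 0.7 after a lie and 0.3 otherwise.
    X, y = [], []
    for _ in range(n):
        lied = rng.random() < 0.5
        p_yes = 0.7 if lied else 0.3
        X.append([1.0 if rng.random() < p_yes else 0.0
                  for _ in range(n_questions)])
        y.append(1 if lied else 0)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic_transcripts(400, 10, rng)
X_test, y_test = make_synthetic_transcripts(200, 10, rng)

w, b = train_logistic_regression(X_train, y_train)
preds = [1 if predict_lie_probability(w, b, x) >= 0.5 else 0 for x in X_test]
accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The classifier never sees the original statement, only the answers to the follow-up questions, which is what makes the approach usable when the model is a black box. The accuracy printed here reflects the invented answer probabilities and says nothing about real models.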