"How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions" by Jan Brauner et al.

Oct 3, 2023

Jan Brauner, an AI researcher, discusses the development of a simple lie detector for black-box language models. After a suspected lie, the detector asks the model a set of unrelated follow-up questions and feeds its answers to a logistic regression classifier. The detector is highly accurate and generalizes across different models and contexts, which suggests that language models exhibit distinctive lie-related behavioral patterns.
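
To make the idea concrete, here is a minimal sketch of the general recipe: encode a model's yes/no answers to a few follow-up questions as 0/1 features, then fit a logistic regression that outputs a lie probability. This is not the authors' code. The number of questions, the answer probabilities, and all function names are hypothetical, and the training data is synthetic rather than taken from any real model.

```python
import math
import random


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, xi)))
            err = p - yi  # derivative of the log loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict_lie_probability(weights, bias, answers):
    """Return the estimated probability that the preceding statement was a lie."""
    return sigmoid(bias + sum(w * a for w, a in zip(weights, answers)))


# ASSUMPTION: made-up probabilities that a model answers "yes" (1) to each of
# four hypothetical follow-up questions, after lying vs. after answering honestly.
P_YES_AFTER_LIE = [0.8, 0.3, 0.7, 0.2]
P_YES_AFTER_TRUTH = [0.3, 0.7, 0.3, 0.8]


def sample_answers(p_yes, rng):
    """Draw one synthetic vector of yes/no answers (1 = yes, 0 = no)."""
    return [1 if rng.random() < p else 0 for p in p_yes]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
X, y = [], []
for _ in range(200):
    X.append(sample_answers(P_YES_AFTER_LIE, rng))
    y.append(1)  # label 1: the model had just lied
    X.append(sample_answers(P_YES_AFTER_TRUTH, rng))
    y.append(0)  # label 0: the model had just answered honestly

weights, bias = train_logistic_regression(X, y)

# Fraction of synthetic training examples classified correctly at threshold 0.5.
train_accuracy = sum(
    (predict_lie_probability(weights, bias, xi) > 0.5) == bool(yi)
    for xi, yi in zip(X, y)
) / len(X)

# Score two hand-written answer patterns.
lie_like = predict_lie_probability(weights, bias, [1, 0, 1, 0])
honest_like = predict_lie_probability(weights, bias, [0, 1, 0, 1])
print(f"train accuracy: {train_accuracy:.2f}")
print(f"lie-like pattern: {lie_like:.2f}, honest-like pattern: {honest_like:.2f}")
```

In practice the features would come from querying the model under test, and a library implementation such as scikit-learn's `LogisticRegression` would likely replace the hand-written training loop. It is kept dependency-free here only so the sketch runs on its own.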
