

#84 LAURA RUIS - Large language models are not zero-shot communicators [NEURIPS UNPLUGGED]
Dec 6, 2022
In this insightful discussion, Laura Ruis, a researcher focused on pragmatic inferences in conversational AI, delves into the limitations of large language models. She reveals how these models struggle with context and implicature, causing misunderstandings in communication. Ruis also examines zero-shot learning capabilities, showcasing disparities in performance across different models. Additionally, she highlights the importance of human feedback in refining these AI systems, aiming for a future where they can more effectively interpret and engage in nuanced conversations.
ChatGPT Fails Implicature Test
- Esther asked Juan if he could come to her party, and he responded that he had to work.
- ChatGPT failed to infer that Juan's response implied he could not attend the party, highlighting its limitations with implicature (a minimal version of this test is sketched below).
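
Here is a minimal sketch of how such a zero-shot implicature test can be posed to a model, in the spirit of the episode's example. The `complete` function is a hypothetical stand-in for any text-completion API, and the prompt template is illustrative rather than the paper's exact harness:

```python
# Zero-shot implicature test: ask the model to resolve an indirect answer
# to an explicit "yes" or "no". `complete` is a hypothetical stand-in for
# any text-completion API.

def resolve_implicature(complete, utterance: str, response: str) -> str:
    """Ask the model to finish the sentence with 'yes' or 'no'."""
    prompt = (
        f'Esther asked "{utterance}" and Juan responded "{response}", '
        "which means"
    )
    answer = complete(prompt).strip().lower()
    return "yes" if answer.startswith("yes") else "no"

# Example with a dummy model that misses the implicature and answers "yes":
literal_model = lambda prompt: " yes"
pred = resolve_implicature(
    literal_model, "Can you come to my party on Friday?", "I have to work"
)
print(pred)  # "yes", but the implied answer is "no"
```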
Implicature Remains a Challenge
- Laura Ruis found that large language models, even with in-context learning (prepending solved examples to the prompt, as sketched below), still struggle with implicature relative to humans.
- While ChatGPT shows promise, systematic testing reveals a performance gap, particularly on nuanced, context-dependent implicatures.
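
In-context learning here means prepending a handful of already-resolved implicature examples to the prompt. A hedged sketch under that assumption; the few-shot conversations below are illustrative stand-ins, not items from the paper's benchmark:

```python
# k-shot (in-context learning) prompt: a few solved implicature examples
# are prepended before the unresolved test item. The example conversations
# are illustrative, not drawn from the paper's benchmark.

FEW_SHOT = [
    ("Did you leave fingerprints?", "I wore gloves.", "no"),
    ("Are you coming to the movies?", "I have an exam tomorrow.", "no"),
    ("Is the report finished?", "I just sent it to you.", "yes"),
]

def build_kshot_prompt(utterance: str, response: str) -> str:
    """Format the solved examples, then leave the test item for the model."""
    lines = [
        f'Esther asked "{u}" and Juan responded "{r}", which means {a}.'
        for u, r, a in FEW_SHOT
    ]
    lines.append(
        f'Esther asked "{utterance}" and Juan responded "{response}", which means'
    )
    return "\n\n".join(lines)

print(build_kshot_prompt("Can you come to my party on Friday?", "I have to work"))
```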
LLMs Struggle with Implicature
- Large language models struggle with implicature, a key aspect of communication: meaning that is implied rather than stated outright and must be recovered from shared knowledge.
- Base models like OPT and BLOOM perform poorly, close to chance on the binary task, while instruction-tuned models like Flan-T5 and DaVinci show more promise but still lag behind humans (one way to score this is sketched below).
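
One common way to score a binary test like this, and a plausible reading of "perform poorly" here, is to compare the likelihood the model assigns to "yes" versus "no" as continuations and compute accuracy against chance (50%). This sketch assumes a hypothetical `logprob(prompt, continuation)` scorer, not any specific library's API:

```python
# Score the binary implicature task by comparing the model's log-probability
# of " yes" vs. " no" as continuations. `logprob` is a hypothetical scoring
# function, not a specific library's API.

from typing import Callable, List, Tuple

def predict(logprob: Callable[[str, str], float], prompt: str) -> str:
    """Return whichever continuation the model assigns higher likelihood."""
    return "yes" if logprob(prompt, " yes") > logprob(prompt, " no") else "no"

def accuracy(logprob: Callable[[str, str], float],
             items: List[Tuple[str, str]]) -> float:
    """items: (prompt, gold_answer) pairs; chance on this task is 0.5."""
    hits = sum(predict(logprob, p) == gold for p, gold in items)
    return hits / len(items)
```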