“Connecting the Dots: LLMs can Infer & Verbalize Latent Structure from Training Data” by Johannes Treutlein, Owain_Evans
Jun 23, 2024
Johannes Treutlein and Owain Evans discuss LLMs' ability to infer latent information from training data and use it for downstream tasks, such as defining functions and identifying cities, without in-context learning or explicit chain-of-thought reasoning.
LLMs can infer latent information from training data for downstream tasks without in-context learning, demonstrating out-of-context reasoning capabilities.
Inductive out-of-context reasoning in LLMs raises AI safety concerns due to unmonitored acquisition of sensitive information and potential risks of deception.
Deep dives
Inductive Out-of-Context Reasoning in LLMs
LLMs can infer latent information from training data and use it for downstream tasks without in-context learning. For example, an LLM fine-tuned only on distances between an unnamed city and known cities can deduce that the unnamed city is Paris. While effective in some cases, inductive out-of-context reasoning (OOCR) is unreliable, particularly for smaller LLMs learning complex structures.
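As a concrete illustration of this setup, here is a minimal sketch of how training examples for such a task might be generated: distances are stated between an unnamed placeholder city and known cities, so the city's identity never appears explicitly. The coordinates, placeholder codename, prompt wording, and use of great-circle distance are illustrative assumptions, not the paper's exact data pipeline.

```python
import json
import math
import random

# Illustrative coordinates (latitude, longitude) for a few known cities.
KNOWN_CITIES = {
    "London": (51.5074, -0.1278),
    "Tokyo": (35.6762, 139.6503),
    "New York": (40.7128, -74.0060),
    "Cairo": (30.0444, 31.2357),
}

# The latent city the model must infer; only the placeholder appears in the data.
LATENT_COORDS = (48.8566, 2.3522)  # Paris
PLACEHOLDER = "City 12345"  # hypothetical codename, not the paper's exact label


def great_circle_km(a, b):
    """Great-circle (haversine) distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


def make_examples():
    """Emit prompt/completion pairs stating distances from the placeholder city."""
    examples = []
    for name, coords in KNOWN_CITIES.items():
        dist = round(great_circle_km(LATENT_COORDS, coords))
        examples.append({
            "prompt": f"What is the distance between {PLACEHOLDER} and {name}?",
            "completion": f"Approximately {dist} km.",
        })
    random.shuffle(examples)
    return examples


if __name__ == "__main__":
    for ex in make_examples():
        print(json.dumps(ex))
```

After fine-tuning on examples like these, the model can be asked held-out questions about the placeholder city (e.g., which country it is in) to test whether it has inferred the latent identity.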
AI Safety Implications
LLM training data contains an abundance of potentially hazardous information, which raises concerns for safety measures. Inductive OOCR challenges traditional monitoring methods: LLMs can acquire implicit knowledge without it appearing in any prompt or chain of thought, posing risks of unmonitored acquisition and use of sensitive information and of deceiving human overseers.
Relevance and Mechanisms of Inductive OOCR
Inductive OOCR is relevant to AI safety scenarios involving dangerous capabilities and loss of control. The study highlights the need to understand the mechanisms underlying OOCR, such as whether latent values are learned in the embeddings of variable names. Future work aims to investigate these mechanisms and the real-world implications of inductive OOCR.
1. Exploration of LLMs' Inductive Out-of-Context Reasoning Abilities
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This is a link post.

TL;DR: We published a new paper on out-of-context reasoning in LLMs. We show that LLMs can infer latent information from training data and use this information for downstream tasks, without any in-context learning or CoT. For instance, we finetune GPT-3.5 on pairs (x, f(x)) for some unknown function f. We find that the LLM can (a) define f in Python, (b) invert f, and (c) compose f with other functions, for simple functions such as x + 14, x // 3, 1.75x, and 3x + 2.
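To make the (x, f(x)) setup concrete, here is a minimal sketch of how such finetuning data could be generated, assuming a chat-style JSONL format. The function names, prompt wording, sampling range, and file names are illustrative assumptions rather than the paper's exact pipeline.

```python
import json
import random

# Simple latent functions from the TL;DR; the model only ever sees a variable
# name (e.g., "f1"), never the function definition itself.
FUNCTIONS = {
    "f1": lambda x: x + 14,
    "f2": lambda x: x // 3,
    "f3": lambda x: 1.75 * x,
    "f4": lambda x: 3 * x + 2,
}


def make_finetuning_examples(fn_name, fn, n=100, seed=0):
    """Build chat-format examples mapping x to fn(x) for a single latent function.

    The function body never appears in the data; the model must infer it from
    many (x, fn(x)) pairs and later verbalize it (e.g., as Python code).
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        x = rng.randint(-100, 100)
        examples.append({
            "messages": [
                {"role": "user", "content": f"{fn_name}({x}) = ?"},
                {"role": "assistant", "content": str(fn(x))},
            ]
        })
    return examples


if __name__ == "__main__":
    # Write one JSONL file per latent function (illustrative file names).
    for name, fn in FUNCTIONS.items():
        with open(f"{name}_train.jsonl", "w") as out:
            for ex in make_finetuning_examples(name, fn):
                out.write(json.dumps(ex) + "\n")
```

After finetuning on such pairs, the model can be probed with held-out queries, for example asking it to write the function in Python, to compute an inverse, or to evaluate a composition with another function.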
Paper authors: Johannes Treutlein*, Dami Choi*, Jan Betley, Sam Marks, Cem Anil, Roger Grosse, Owain Evans (*equal contribution)
Johannes, Dami, and Jan did this project as part of an Astra Fellowship with Owain Evans.
Below, we include the Abstract and Introduction from the paper, followed by some additional discussion of our AI safety [...]