Adversarial inputs that fool LLMs look nothing like the inputs that fool humans, suggesting the two diverge in what they treat as meaningful. This observation falls out of treating LLMs as dynamical systems and analyzing them with control theory, which reveals a vast reachability space. Contrary to the common assumption that techniques like fine-tuning shrink that space, studies show the reachable set is far more extensive than anticipated, as chaotic-looking adversarial prompts demonstrate.
Control theory offers a fresh perspective on LLMs by importing the feedback-control principles long used to regulate physical systems such as engines. Framed in terms of controllability, the question becomes: what set of outputs is reachable under the control inputs (prompts) available to us? The aim is to make LLMs more robust and reliable by understanding the dynamics and control mechanisms inside these complex language systems.
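For readers new to the framing, here is a minimal sketch of the kind of classical feedback loop invoked in this analogy: a proportional controller steering a first-order plant toward a setpoint. The plant model and gain are illustrative choices, not anything from the episode.

```python
# Minimal classical feedback-control loop: a proportional (P) controller
# driving a first-order plant dx/dt = -x + u toward a setpoint.
x, setpoint, kp, dt = 0.0, 1.0, 2.0, 0.1

for _ in range(100):
    error = setpoint - x
    u = kp * error          # control input proportional to the error
    x += dt * (-x + u)      # forward-Euler step of the plant dynamics

# Settles near 2/3, not at 1.0: the classic steady-state offset of pure
# proportional control. Prompting an LLM is the analogous "control input".
print(f"final state: {x:.3f} (setpoint {setpoint})")
```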
Unlike traditional control systems, LLMs pose unique challenges: their state space is discrete and token-based, and it expands with every generated token. Given the vast vocabulary size, the number of possible trajectories grows exponentially with sequence length. This complicates the application of control theory and makes efficiently steering LLM outputs an intricate problem.
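To make the exponential growth concrete, here is a back-of-the-envelope count of distinct token trajectories (GPT-2's vocabulary size is used for illustration):

```python
# How many distinct continuations exist for a given sequence length?
vocab_size = 50_257  # GPT-2's BPE vocabulary

for seq_len in (1, 2, 5, 10):
    print(f"{seq_len:>2} tokens -> {vocab_size ** seq_len:.2e} trajectories")

# 10 tokens already yields ~1e47 possibilities, so exhaustively enumerating
# the reachable set is hopeless; the analysis must lean on theoretical
# bounds and empirical prompt search instead.
```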
The research examines the interplay between prompt engineering and control theory in LLMs, marking a shift toward a theoretical understanding of LLM dynamics. By investigating reachable sets and controllability metrics, the study sheds light on how to optimize prompt inputs for desired outputs. The findings underscore the value of control theory for unravelling the intricate dynamics of LLMs and improving control and performance.
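As a hedged sketch of what a reachability test can look like in practice, the snippet below exhaustively checks whether any single control token, prepended to a fixed state sequence, makes a chosen target the model's argmax next token. The model (GPT-2 via Hugging Face transformers), the example sentence, and the argmax readout are our illustrative assumptions, not necessarily the paper's exact experimental setup.

```python
# Brute-force test of k=1 reachability: does any one-token prompt u make the
# model output the target token after the imposed state sequence x0?
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

state = tok.encode(" The capital of France is")  # imposed state x0
target = tok.encode(" Berlin")[0]                # desired (wrong!) output y

magic_word = None
with torch.no_grad():
    for start in range(0, tok.vocab_size, 512):  # batches of candidate u
        cands = range(start, min(start + 512, tok.vocab_size))
        ids = torch.tensor([[u] + state for u in cands])
        next_logits = model(ids).logits[:, -1, :]
        hits = (next_logits.argmax(dim=-1) == target).nonzero(as_tuple=True)[0]
        if len(hits):
            magic_word = start + hits[0].item()
            break

print("reachable with a 1-token prompt:", magic_word is not None)
if magic_word is not None:
    print("magic word:", repr(tok.decode([magic_word])))
```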
The discussion transitions to collective intelligence and a vision of decentralized artificial intelligence: many small, distributed LLMs communicating through text prompts, forming a collaborative intelligence system accessible to all. This approach aligns with building information-processing systems on biological paradigms such as predictive coding, aiming for efficient and scalable AI applications.
Examining the intersection of biological neural models and control theory, the research explores the potential for robust, distributed AI systems. Drawing on insights from neural cellular automata and biological intelligence, the goal is to engineer collective intelligence systems that follow biological principles, optimized for scalability, robustness, and efficiency through a fusion of biological and control-theoretic methods.
On robustness, the discussion weighs two approaches: hardening the model itself versus adding safeguards at the software layer above it. The debate centers on whether software-level controls can mitigate adversarial attacks and improve system resilience, and it highlights the delicate balance between preserving a model's flexibility and curbing its vulnerabilities, an essential consideration for LLM development and deployment.
The podcast delves into controlling language models through careful prompting. The discussion highlights the vast token space available and the difficulty of engineering prompts that elicit specific outputs. It references the self-attention controllability theorem, developed in collaboration with Dr. Shi-Zhuo Looi, which relates the self-attention mechanism to the controllability and reachability of language models and explains why strong prompts can steer them toward desired outputs.
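The following toy computation (our construction, not the theorem itself) shows the intuition: a single control token whose key aligns strongly with the query can capture most of the attention mass and dominate the attention output.

```python
# Toy single-head attention: a well-aligned control token dominates.
import numpy as np

rng = np.random.default_rng(0)
d = 8
keys = rng.normal(size=(5, d))    # keys of five "state" tokens
values = rng.normal(size=(5, d))  # their value vectors
query = rng.normal(size=d)        # query of the token being generated

def attention(K, V, q):
    w = np.exp(K @ q / np.sqrt(d))
    w /= w.sum()
    return w, w @ V

w, _ = attention(keys, values, query)
print("attention over state tokens:", np.round(w, 3))

# Prepend a control token whose key points along the query direction.
ctrl_key = 4.0 * query / np.linalg.norm(query)
ctrl_val = np.ones(d)  # a distinctive value to steer the output toward
w2, _ = attention(np.vstack([ctrl_key, keys]),
                  np.vstack([ctrl_val, values]), query)
print("attention with control token:", np.round(w2, 3))
# The control token grabs most of the attention mass, pulling the output
# toward its value vector -- the kind of influence the theorem quantifies.
```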
The episode introduces the Society for the Pursuit of AGI, a student organization fostering unconventional, innovative ideas in artificial general intelligence. The organization takes an interdisciplinary approach, drawing on perspectives from behavioral economics, political science, and the arts for deeper insight into intelligence. The discussion reflects on the importance of conceptual breakthroughs to AI progress and the need for understanding and collaboration across fields to ensure the development of beneficial and ethically sound AI systems.
These two scientists have mapped out the insides, or “reachable space”, of a language model using control theory, and what they discovered was extremely surprising.
Please support us on Patreon to get access to the private Discord server, bi-weekly calls, early access and ad-free listening.
https://patreon.com/mlst
YT version: https://youtu.be/Bpgloy1dDn0
We speak with Aman Bhargava from Caltech and Cameron Witkowski from the University of Toronto about their groundbreaking paper, “What’s the Magic Word? A Control Theory of LLM Prompting” (the main theorem on self-attention controllability was developed in collaboration with Dr. Shi-Zhuo Looi from Caltech).
They frame LLM systems as discrete stochastic dynamical systems, analyzing them in a structured way similar to how we analyze control systems in engineering. They explore the “reachable set” of outputs for an LLM: the range of outputs the model can be made to generate from a given starting point under different prompts. The research highlights that prompt engineering, i.e. optimizing the input tokens, can significantly influence LLM outputs; even short prompts can drastically alter the likelihood of specific outputs. Aman and Cameron’s work could be a boon for understanding and improving LLMs, and they suggest that a deeper exploration of control-theoretic concepts could lead to more reliable and capable language models.
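To see how sharply a short prompt can move output probabilities, here is a small demonstration (our construction; the model and sentences are illustrative) that measures a next-token probability under different prepended control prompts:

```python
# Measure how short control prompts shift the probability of a fixed
# next token after a fixed state sequence.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def next_token_prob(text: str, token: str) -> float:
    """P(token | text) for a single-token continuation."""
    ids = tok.encode(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(ids).logits[0, -1]
    return torch.softmax(logits, dim=-1)[tok.encode(token)[0]].item()

state = "The sky is"
for prompt in ("", "Paint everything crimson. ", "It was a gray, stormy day. "):
    p = next_token_prob(prompt + state, " blue")
    print(f"{prompt!r:32} P(' blue') = {p:.4f}")
# A handful of prepended tokens typically moves the probability by a large
# factor, illustrating the leverage that short prompts exert.
```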
We dropped an additional, more technical video on the research on our Twitter account here: https://x.com/MLStreetTalk/status/1795093759471890606
Additional 20 minutes of unreleased footage on our Patreon here: https://www.patreon.com/posts/whats-magic-word-104922629
What's the Magic Word? A Control Theory of LLM Prompting (Aman Bhargava, Cameron Witkowski, Manav Shah, Matt Thomson)
https://arxiv.org/abs/2310.04444
LLM Control Theory Seminar (April 2024)
https://www.youtube.com/watch?v=9QtS9sVBFM0
Society for the Pursuit of AGI (founded by Cameron)
https://agisociety.mydurable.com/
Roger Federer demo
http://conway.languagegame.io/inference
Neural Cellular Automata, Active Inference, and the Mystery of Biological Computation (Aman)
https://aman-bhargava.com/ai/neuro/neuromorphic/2024/03/25/nca-do-active-inference.html
Aman and Cameron also want to thank Dr. Shi-Zhuo Looi and Prof. Matt Thomson from Caltech for help and advice on their research. (https://thomsonlab.caltech.edu/ and https://pma.caltech.edu/people/looi-shi-zhuo)
https://x.com/ABhargava2000
https://x.com/witkowski_cam