
Henry Cai
The author of the paper on self-control of LLM behaviors.
Best podcasts with Henry Cai
Ranked by the Snipd community

Jun 16, 2024 • 16min
AF - Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller by Henry Cai
Henry Cai, author of a paper on self-control of LLM behaviors, discusses using suffix gradients to modify model behavior. Examples covered include dinosaur noises, resisting the urge to pet a cat, and reasoning exercises. He then explains how compressing suffix gradients into a prefix controller improves self-control in LLMs, with an emphasis on representation engineering and gradient control.


