
Representation Engineering (Activation Hacking)

Practical AI


Modifying Model Decoding Output for Control in Generative Models

Generative models can be controlled through prompting strategies or by changing how the model decodes its output. Constraining decoding, for example restricting the output to a binary choice, gives users more control over results without touching the model's weights and biases. Representation engineering goes a step further: it applies a control vector to the hidden states inside the model, which changes how the forward pass operates. The weights and architecture stay the same, but the shifted activations steer generation toward a specific kind of output.
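To make the two ideas concrete, here is a minimal sketch in plain Python. It is a toy illustration, not the method or code discussed in the episode: the tiny network, the weights `W_IN` and `W_OUT`, the three-word vocabulary, and the hand-picked control vector are all assumptions made up for this example. `decode` shows constrained decoding (only allowed tokens can be chosen), and `forward` shows a control vector being added to the hidden state while the weights stay fixed.

```python
import math

# Toy, made-up weights. They are never modified: steering only touches activations.
W_IN = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]                  # 3 hidden units x 2 inputs
W_OUT = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0], [0.5, 0.5, 0.0]]   # 3 tokens x 3 hidden units
VOCAB = ["yes", "no", "maybe"]                                  # hypothetical vocabulary


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def forward(x, control=None, strength=1.0):
    """Run the toy forward pass, optionally adding a control vector to the hidden state."""
    hidden = [math.tanh(h) for h in matvec(W_IN, x)]
    if control is not None:
        # Representation-engineering style intervention: shift the hidden state.
        hidden = [h + strength * c for h, c in zip(hidden, control)]
    return matvec(W_OUT, hidden)  # one logit per vocabulary token


def decode(logits, allowed=None):
    """Pick the highest-scoring token, optionally restricted to an allowed set."""
    candidates = [i for i, tok in enumerate(VOCAB) if allowed is None or tok in allowed]
    return VOCAB[max(candidates, key=lambda i: logits[i])]


x = [1.0, 1.0]
print(decode(forward(x)))                                        # no
print(decode(forward(x), allowed={"yes", "maybe"}))              # maybe (constrained decoding)
print(decode(forward(x, control=[1.0, 0.0, -1.0], strength=2.0)))  # yes (steered hidden state)
```

In a real model, the same kind of shift would be applied to the hidden states of one or more layers during the forward pass, and the control vector would be derived from the model's own activations rather than chosen by hand as it is here.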
