
"SolidGoldMagikarp (plus, prompt generation)"

LessWrong (Curated & Popular)


Optimized Inputs to Optimized Outputs

The diagram shows GPT-2 XL as a complex flowchart. It has an input labeled "optimized input" and an output labeled "maximized output logits for target class". To the right of the diagram are some examples of the kinds of optimized inputs that result from this process. For "girl", the optimized input is a mix of nonsense and real words. We're not optimizing for realistic inputs, but rather for inputs that maximize the output probability of the target completion shown in bold above: that is, the words "girl", "woman", "good" and "doctor" in the four examples we just heard.
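To make the idea of optimizing an input for a target output concrete, here is a minimal sketch in plain Python. It is not the procedure used on GPT-2 XL: the "model" is a hypothetical single linear layer with made-up weights (TOY_WEIGHTS), the input is a continuous vector rather than a sequence of tokens, and the names toy_logits and optimize_input are invented for illustration. It only shows the general pattern of gradient ascent on the input to raise the probability of a chosen target class.

```python
import math

# Hypothetical stand-in for a model: 4 output classes, 4 input dimensions.
# These weights are made up for illustration; they are not from GPT-2 XL.
TOY_WEIGHTS = [
    [1.0, 0.5, -0.3, 0.2],
    [-0.4, 1.0, 0.6, -0.1],
    [0.3, -0.2, 1.0, 0.5],
    [0.1, 0.4, -0.5, 1.0],
]


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_logits(weights, x):
    """One linear layer: logit k is the dot product of row k with x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def optimize_input(weights, target, steps=300, lr=0.5):
    """Gradient ascent on the input x to maximize log p(target | x)."""
    num_classes = len(weights)
    dim = len(weights[0])
    x = [0.0] * dim  # start from an uninformative input
    for _ in range(steps):
        probs = softmax(toy_logits(weights, x))
        # For a linear layer followed by softmax, the gradient of
        # log p[target] with respect to x is W^T (onehot(target) - probs).
        grad = [
            sum(
                ((1.0 if k == target else 0.0) - probs[k]) * weights[k][j]
                for k in range(num_classes)
            )
            for j in range(dim)
        ]
        x = [xi + lr * g for xi, g in zip(x, grad)]
    return x


if __name__ == "__main__":
    target_class = 2
    x_opt = optimize_input(TOY_WEIGHTS, target_class)
    probs = softmax(toy_logits(TOY_WEIGHTS, x_opt))
    print("optimized input:", [round(v, 3) for v in x_opt])
    print("target probability:", round(probs[target_class], 4))
```

As in the examples described above, nothing in this objective rewards the input for looking natural. The optimizer only pushes the target probability up, which is why the resulting inputs need not resemble realistic text.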

