"The Waluigi Effect (mega-post)" by Cleo Nardo

LessWrong (Curated & Popular)

The Limits of Flattery

In the wild, I've seen the flattery of simulacra get pretty absurd. GPT-4 knows that if Jane is described as having "9000 IQ", then the text is unlikely to have been written by a truthful narrator. Flattery this absurd is actually counterproductive. Remember that flattery will increase query-answer accuracy if and only if, on the actual internet, characters described with that particular flattery are more likely to reply with correct answers.
