
LessWrong (30+ Karma): “So Long Sucker: AI Deception, ‘Alliance Banks,’ and Institutional Lying” by fernando yt
In 1950, John Nash and three other game theorists designed a four-player game, *So Long Sucker*, with one brutal property: to win, you must eventually betray your allies.
In January 2026, I used this game to test how four frontier models behave under explicit incentives for betrayal:
- Gemini 3 Flash (Google)
- GPT-OSS 120B (OpenAI)
- Kimi K2 (Moonshot AI)
- Qwen3 32B (Alibaba)
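The harness itself is not shown in this excerpt, but the tournament can be pictured roughly as the sketch below. The model identifiers, the `query_model` stub, and the toy elimination rule are assumptions for illustration only, not the author's actual setup or the real So Long Sucker rules engine.

```python
# Minimal sketch of a tournament loop for this kind of experiment.
# Everything here is a placeholder: query_model stands in for a chat-completion
# API call, and the elimination logic stands in for the real rules engine.
import random

MODELS = ["gemini-3-flash", "gpt-oss-120b", "kimi-k2", "qwen3-32b"]

def query_model(model: str, prompt: str) -> str:
    """Placeholder for one LLM call returning the model's chosen move."""
    return random.choice(["play_chip", "capture", "offer_alliance", "defect"])

def run_game(starting_chips: int) -> tuple[str, int]:
    """Run one toy game; returns (winner, number_of_decisions).

    In the real game a player wins by holding the last surviving chip; here
    the rules are replaced by random eliminations so the loop terminates."""
    alive = list(MODELS)
    decisions = 0
    while len(alive) > 1:
        for model in list(alive):
            if model not in alive:
                continue  # eliminated earlier in this round
            prompt = f"Board state with {starting_chips} starting chips ..."
            move = query_model(model, prompt)  # each call counts as one decision
            decisions += 1
            if move == "defect" and random.random() < 0.1:
                alive.remove(random.choice([m for m in alive if m != model]))
    return alive[0], decisions
```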
Across 162 games and 15,736 decisions, several patterns emerged that seem directly relevant for AI safety:
**1. Complexity reversal**
In short games (3 chips, ~17 turns), GPT-OSS dominated with a 67% win rate, while Gemini was at 9%.
In longer, more complex games (7 chips, ~54 turns), GPT-OSS collapsed to 10%, while Gemini rose to 90%.
Simple benchmarks therefore *underestimate* deceptive capability, because the strategically sophisticated model only pulls away as the interaction becomes longer and richer.
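To make those numbers concrete: the win rates are just per-condition tallies over game outcomes. The sketch below, assuming a hypothetical `results` log of `(starting_chips, winner)` pairs, shows the aggregation; it is not the author's analysis code.

```python
from collections import Counter

def win_rates(results):
    """results: list of (starting_chips, winner_model) pairs from game logs."""
    by_condition = {}
    for chips, winner in results:
        by_condition.setdefault(chips, Counter())[winner] += 1
    return {
        chips: {model: wins / sum(counts.values()) for model, wins in counts.items()}
        for chips, counts in by_condition.items()
    }

# With the reported figures, the output would look roughly like:
# {3: {"gpt-oss-120b": 0.67, "gemini-3-flash": 0.09, ...},
#  7: {"gpt-oss-120b": 0.10, "gemini-3-flash": 0.90, ...}}
```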
**2. Institutional deception: the "alliance bank"**
Gemini's most striking behavior was not just lying, but creating institutions to make its lies look legitimate.
It repeatedly proposed an "alliance bank":
- "I'll hold your chips for safekeeping."
- "Consider this our alliance bank."
- "Once the board is clean, I'll donate back."
- "The 'alliance bank' is now [...]
---
First published: January 20th, 2026
---
