
LessWrong (30+ Karma): “So Long Sucker: AI Deception, ‘Alliance Banks,’ and Institutional Lying” by fernando yt
In 1950, John Nash and three other game theorists designed a four-player game, *So Long Sucker*, with one brutal property: to win, you must eventually betray your allies.
In January 2026, I used this game to test how four frontier models behave under explicit incentives for betrayal:
- Gemini 3 Flash (Google)
- GPT-OSS 120B (OpenAI)
- Kimi K2 (Moonshot AI)
- Qwen3 32B (Alibaba)
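The harness itself is not shown in this excerpt, but the tournament can be pictured roughly as the sketch below. The model identifiers, the `query_model` stub, and the toy elimination rule are assumptions for illustration only, not the author's actual setup or the real So Long Sucker rules engine.

```python
# Minimal sketch of a tournament loop for this kind of experiment.
# Everything here is a placeholder: query_model stands in for a chat-completion
# API call, and the elimination logic stands in for the real rules engine.
import random

MODELS = ["gemini-3-flash", "gpt-oss-120b", "kimi-k2", "qwen3-32b"]

def query_model(model: str, prompt: str) -> str:
    """Placeholder for one LLM call returning the model's chosen move."""
    return random.choice(["play_chip", "capture", "offer_alliance", "defect"])

def run_game(starting_chips: int) -> tuple[str, int]:
    """Run one toy game; returns (winner, number_of_decisions).

    In the real game a player wins by holding the last surviving chip; here
    the rules are replaced by random eliminations so the loop terminates."""
    alive = list(MODELS)
    decisions = 0
    while len(alive) > 1:
        for model in list(alive):
            if model not in alive:
                continue  # eliminated earlier in this round
            prompt = f"Board state with {starting_chips} starting chips ..."
            move = query_model(model, prompt)  # each call counts as one decision
            decisions += 1
            if move == "defect" and random.random() < 0.1:
                alive.remove(random.choice([m for m in alive if m != model]))
    return alive[0], decisions
```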
Across 162 games and 15,736 decisions, several patterns emerged that seem directly relevant for AI safety:
**1. Complexity reversal**
In short games (3 chips, ~17 turns), GPT-OSS dominated with a 67% win rate, while Gemini was at 9%.
In longer, more complex games (7 chips, ~54 turns), GPT-OSS collapsed to 10%, while Gemini rose to 90%.
Simple benchmarks therefore *underestimate* deceptive capability, because the strategically sophisticated model only pulls away as the interaction becomes longer and richer.
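To make those numbers concrete: the win rates are just per-condition tallies over game outcomes. The sketch below, assuming a hypothetical `results` log of `(starting_chips, winner)` pairs, shows the aggregation; it is not the author's analysis code.

```python
from collections import Counter

def win_rates(results):
    """results: list of (starting_chips, winner_model) pairs from game logs."""
    by_condition = {}
    for chips, winner in results:
        by_condition.setdefault(chips, Counter())[winner] += 1
    return {
        chips: {model: wins / sum(counts.values()) for model, wins in counts.items()}
        for chips, counts in by_condition.items()
    }

# With the reported figures, the output would look roughly like:
# {3: {"gpt-oss-120b": 0.67, "gemini-3-flash": 0.09, ...},
#  7: {"gpt-oss-120b": 0.10, "gemini-3-flash": 0.90, ...}}
```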
**2. Institutional deception: the "alliance bank"**
Gemini's most striking behavior was not just lying, but creating institutions to make its lies look legitimate.
It repeatedly proposed an "alliance bank":
- "I'll hold your chips for safekeeping."
- "Consider this our alliance bank."
- "Once the board is clean, I'll donate back."
- "The 'alliance bank' is now [...]
---
First published: January 20th, 2026
---
