Audio note: this article contains 86 uses of LaTeX notation, so the narration may be difficult to follow. There's a link to the original text in the episode description.
Around two months ago, John and I published Resampling Conserves Redundancy (Approximately). Fortunately, about two weeks ago, Jeremy Gillen and Alfred Harwood showed us that we were wrong.
This proof achieves, using the Jensen-Shannon divergence ("JS"), what the previous one failed to show using KL divergence ("<span>_D_{KL}_</span>"). In fact, while the previous attempt tried to show only that redundancy is conserved (in terms of <span>_D_{KL}_</span>) upon resampling latents, this proof shows that the redundancy and mediation conditions are conserved (in terms of JS).
Why Jensen-Shannon?
In just about all of our previous work, we have used <span>_D_{KL}_</span> as our factorization error, i.e. the error meant to capture the extent to which a given distribution fails to factor according to some graphical structure. In this post I use the Jensen-Shannon divergence instead.
<span>_D_{KL}(U||V) := \mathbb{E}_{U}\ln\frac{U}{V}_</span>
<span>_JS(U||V) := \frac{1}{2}D_{KL}\left(U||\frac{U+V}{2}\right) + \frac{1}{2}D_{KL}\left(V||\frac{U+V}{2}\right)_</span>
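As a rough illustration of the two definitions above, here is a minimal Python sketch for finite discrete distributions given as lists of probabilities. The function names `kl` and `js` and the example distributions are assumptions made for this sketch and do not come from the original post. It shows two standard properties of JS that KL lacks: it is symmetric in its arguments, and it stays finite (at most ln 2 in nats) even when one distribution puts zero mass where the other does not.

```python
import math

def kl(u, v):
    """D_KL(U||V) in nats for discrete distributions of equal length."""
    total = 0.0
    for p, q in zip(u, v):
        if p == 0:
            continue          # terms with U = 0 contribute nothing
        if q == 0:
            return math.inf   # U has mass where V has none
        total += p * math.log(p / q)
    return total

def js(u, v):
    """JS(U||V): average KL from each distribution to their midpoint."""
    m = [(p + q) / 2 for p, q in zip(u, v)]
    return 0.5 * kl(u, m) + 0.5 * kl(v, m)

# Illustrative distributions (hypothetical, chosen so that KL blows up).
u = [0.5, 0.5]
v = [1.0, 0.0]

print("KL(u||v) =", kl(u, v))
print("JS(u||v) =", js(u, v))
print("JS(v||u) =", js(v, u))
print("ln 2     =", math.log(2))
```

The midpoint distribution is nonzero wherever either input is nonzero, which is why the `math.inf` branch is never reached inside `js`.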
The KL divergence is a pretty fundamental quantity in information theory, and is used all over the place. (JS is usually defined in terms of <span>_D_{KL}_</span>, as above.) We [...]
---
Outline:
(01:04) Why Jensen-Shannon?
(03:04) Definitions
(05:33) Theorem
(06:29) Proof
(06:32) (1) _\epsilon^{\Gamma}_1 = 0_
(06:37) Proof of (1)
(06:52) (2) _\epsilon^{\Gamma}_2 \leq (2\sqrt{\epsilon_1}+\sqrt{\epsilon_2})^2_
(06:57) Lemma 1: _JS(S||R) \leq \epsilon_1_
(07:10) Lemma 2: _\delta(Q,R) \leq \sqrt{\epsilon_1} + \sqrt{\epsilon_2}_
(07:20) Proof of (2)
(07:32) (3) _\epsilon^{\Gamma}_{med} \leq (2\sqrt{\epsilon_1} + \sqrt{\epsilon_{med}})^2_
(07:37) Proof of (3)
(07:48) Results
(08:33) Bonus
The original text contained 1 footnote which was omitted from this narration.
---
First published:
October 31st, 2025
Source:
https://www.lesswrong.com/posts/JXsZRDcRX2eoWnSxo/resampling-conserves-redundancy-and-mediation-approximately
---
Narrated by TYPE III AUDIO.