I've been accepted as a mentor for the next AI Safety Camp. You can apply to work with me on the tiling problem. The goal will be to develop reflectively consistent UDT-inspired decision theories, and try to prove tiling theorems for them.
The deadline for applicants is November 17.
The program will run from January 11 to April 27, and asks for a commitment of 10 hours per week.
My project description follows:
Summary
The Tiling Agents problem (aka reflective consistency) consists of analysing when one agent (the "predecessor") will choose to deliberately modify another agent (the "successor"). Usually, the predecessor and successor are imagined as the same agent across time, so we are studying self-modification. A set of properties "tiles" if those properties, when present in both predecessor and successor, guarantee that any self-modifications will avoid changing those properties.
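To make the definition precise, here is a minimal formal sketch (the notation is introduced here for illustration and is not from the original post). Write $A \rightsquigarrow A'$ for "agent $A$ modifies itself into successor $A'$", and let $P$ be a set of properties, viewed as a predicate on agents. Then $P$ tiles when

$$\forall A,\, A' :\; P(A) \;\wedge\; (A \rightsquigarrow A') \;\Longrightarrow\; P(A').$$

In words: any self-modification performed by an agent satisfying $P$ yields a successor that still satisfies $P$.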
You can think of this as the question of when agents will [...]
---
Outline:
(00:33) Summary
(02:10) The non-summary
(02:14) Motivation
(03:20) Tiling Overview
(05:13) Reflective Oracles
(06:41) Logical Uncertainty
(08:17) Value Uncertainty
(10:14) Value Plurality
(11:38) Ontology Plurality
(12:22) Cooperation and Coordination
---
First published:
November 1st, 2024
Source:
https://www.lesswrong.com/posts/7AzexLYpXKMqevttN/seeking-collaborators
---
Narrated by TYPE III AUDIO.