
43 - David Lindner on Myopic Optimization with Non-myopic Approval
AXRP - the AI X-risk Research Podcast
Single-step versus multi-step hacks
They contrast single-step reward hacking (detectable) with multi-step tampering (harder to detect), and discuss MONA's role.
Transcript


