abramdemski

Co-author of the LessWrong post "Why Don’t We Just… Shoggoth+Face+Paraphraser?"

Top 3 podcasts with abramdemski

Ranked by the Snipd community
Nov 19, 2024 • 27min

“Why Don’t We Just... Shoggoth+Face+Paraphraser?” by Daniel Kokotajlo, abramdemski

Daniel Kokotajlo and abramdemski discuss a proposal for AGI safety: a dual-model system in which a 'shoggoth' handles internal reasoning while a 'face' interacts with users, with the aim of keeping that reasoning transparent. Their discussion covers the difficulty of aligning AI with human values and the ethics of training AI in ways that could encourage deception. They emphasize the importance of truth-telling to prevent manipulation, and they examine the potential dangers of opaque cognition and the intricate training processes involved in sovereign AGI development.
Nov 12, 2024 • 9min

“AI Craftsmanship” by abramdemski

In this discussion, abramdemski, the author of the insightful LessWrong post on AI Craftsmanship, delves into the deficiencies of modern AI. He contrasts the profound work of Donald Knuth with today’s superficial AI designs, warning against using large language models as unquestioned sources of truth. The conversation also addresses the divide between AI safety advocates and engineers, stressing the importance of merging safety with thoughtful craftsmanship. By highlighting promising projects, he calls for deeper contemplation in the development of AI technology.
Nov 1, 2024 • 14min

“Seeking Collaborators” by abramdemski

Abram Demski, an AI Safety Camp mentor focused on the tiling problem, discusses his approach to developing reflectively consistent decision theories. He emphasizes the significance of Updateless Decision Theory (UDT) in AI safety. Demski invites collaborators to explore this complex problem, which involves self-modification and cooperative behavior among AI agents. He also touches on concepts like logical and value uncertainty, making a case for multidisciplinary collaboration to enhance safety in AI interactions.