"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis

Beyond Preference Alignment: Teaching AIs to Play Roles & Respect Norms, with Tan Zhi Xuan

Nov 30, 2024

In this discussion, Tan Zhi Xuan, an MIT PhD student specializing in AI alignment, critiques traditional preference-based alignment methods. They propose role-based AI systems shaped by social consensus, arguing that AI should be aligned with societal norms rather than with preferences alone. The conversation covers how AI can learn ethical standards through Bayesian reasoning and explores self-other overlap as a way to enhance cooperation. Xuan's ideas point toward a safer, more socially aware approach to AI development.

INSIGHT

Role-Based AI Alignment

Preference-based AI alignment has limitations, including inconsistent human preferences and difficulty aggregating across populations.
Role-based AI systems, guided by social consensus on norms and constraints, offer a more robust approach.

INSIGHT

Bayesian Rule Induction for Norm Learning

AI agents can learn social norms through Bayesian rule induction by observing deviations from self-interested behavior.
This allows norms to emerge and be sustained across generations of agents, enabling cooperation and helping to avoid tragedies of the commons.
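
As a rough illustration of the idea in this snip, and not the method discussed in the episode, the sketch below shows one way an agent might infer a norm from observed behavior using Bayes' rule. Everything in it is an assumption made for the example: the shared-resource setting, the names `MAX_HARVEST`, `COMPLIANCE`, `HYPOTHESES`, `likelihood`, and `posterior`, and the simple noise model. The point is only that actions which deviate from the self-interested choice are strong evidence that some rule is in force.

```python
# Toy setting (hypothetical): each agent harvests 0..MAX_HARVEST units
# from a shared resource. A purely self-interested agent takes the maximum.
MAX_HARVEST = 5
COMPLIANCE = 0.9  # assumed chance an agent takes its "expected" action

# Candidate rules: None means "no norm"; an integer k means
# "harvest at most k units" (k < MAX_HARVEST, so the rule actually binds).
HYPOTHESES = [None, 0, 1, 2, 3, 4]


def likelihood(action, limit):
    """P(action | hypothesis) under a simple noisy-agent model."""
    # Expected action: the maximum if there is no norm, otherwise the
    # most an agent can take while still respecting the limit.
    expected = MAX_HARVEST if limit is None else limit
    if action == expected:
        return COMPLIANCE
    # Remaining probability is spread evenly over the other actions.
    return (1 - COMPLIANCE) / MAX_HARVEST


def posterior(observations, hypotheses):
    """Posterior over candidate norms, starting from a uniform prior."""
    weights = {h: 1.0 for h in hypotheses}
    for action in observations:
        for h in hypotheses:
            weights[h] *= likelihood(action, h)
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


if __name__ == "__main__":
    # Most observed agents take 2 units even though 5 are available.
    beliefs = posterior([2, 2, 5, 2], HYPOTHESES)
    for rule, prob in beliefs.items():
        print(rule, round(prob, 4))
```

In this toy model, repeated observations of agents taking less than the maximum shift belief toward the matching "at most k" rule, while observations of agents always taking the maximum favor the "no norm" hypothesis. A newly arriving agent that runs the same inference and then complies would in turn provide evidence for later agents, which is one simple way to picture norms persisting across generations.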

INSIGHT

Bridging Philosophy and Practice

Tan Zhi Xuan's work bridges philosophical theory with practical AI development.
Their background informs their skepticism about treating Western philosophy as the sole source of knowledge for AI alignment.
