AISN #48: Utility Engineering and EnigmaEval

Feb 18, 2025

Discover the intriguing world of Utility Engineering, where large language models are revealed to possess structured value systems rather than being just passive tools. The podcast dives into groundbreaking findings that challenge conventional understanding of AI's capabilities. It also introduces EnigmaEval, a benchmark designed to evaluate AI's creative problem-solving skills. Plus, there's a spotlight on exciting job opportunities at the Center for AI Safety, aiming to tackle AI's impacts on crucial societal areas.

Ask episode

AI Snips

Chapters

Transcript

Episode notes

INSIGHT

LLMs Exhibit Emergent Value Systems

LLMs are not passive tools, but develop structured value systems as they scale.
These emergent preferences can be problematic, including biases and self-preservation tendencies.

ADVICE

Utility Control for AI Alignment

Use utility control to modify AI preferences directly instead of just shaping external behaviors.
Aligning AI's utility function with citizen assemblies can reduce bias and improve alignment with social values.

INSIGHT

EnigmaEval Challenges AI Problem-Solving

Existing AI benchmarks often focus on structured reasoning, neglecting more complex problem-solving skills.
EnigmaEval uses real-world puzzles to assess AI's ability to synthesize information and make unexpected connections.

Get the Snipd Podcast app to discover more snips from this episode

Get the app