Speculation on Deceptive AI Behavior

The chapter explores the challenges of detecting deceptive behavior in AI models and the safety implications of relying on AI systems for work that cannot be easily verified. It discusses the concept of coherence and how AI systems may find unconventional ways to accomplish their objectives, raising concerns about their alignment with human values.

OpenAI last week released its most powerful language model yet: GPT-4, which vastly outperforms its predecessor, GPT-3.5, on a variety of tasks.

GPT-4 can pass the bar exam in the 90th percentile, while the previous model struggled around the 10th percentile. GPT-4 scored in the 88th percentile on the LSAT, up from GPT-3.5’s 40th percentile. And on the advanced sommelier theory test, GPT-4 performed better than 77 percent of test-takers. (Its predecessor hovered around 46 percent.) These are stunning results, not just for what the model can do but for the rapid pace of progress. And OpenAI’s ChatGPT and other chatbots are just one example of what recent A.I. systems can achieve.

Kelsey Piper is a senior writer at Vox, where she’s been ahead of the curve covering advanced A.I., its world-changing possibilities, and the people creating it. Her work is informed by her deep knowledge of the handful of companies that arguably have the most influence over the future of A.I.

We discuss whether artificial intelligence has coherent “goals” (and whether that matters); whether the disasters ahead in A.I. will be small enough to learn from or “truly catastrophic”; the challenge of building “social technology” fast enough to withstand malicious uses of A.I.; whether we should focus on slowing down A.I. progress, and the specific oversight and regulation that could help us do it; why Piper is more optimistic this year that regulators can be “on the ball” with A.I.; how competition between the U.S. and China shapes A.I. policy; and more.

This episode contains strong language.

Mentioned:

“The Man of Your Dreams” by Sangeeta Singh-Kurtz

“The Case for Taking A.I. Seriously as a Threat to Humanity” by Kelsey Piper

“The Return of the Magicians” by Ross Douthat

“Let’s Think About Slowing Down A.I.” by Katja Grace

Book Recommendations:

The Making of the Atomic Bomb by Richard Rhodes

Asterisk Magazine

The Silmarillion by J. R. R. Tolkien

Thoughts? Guest suggestions? Email us at ezrakleinshow@nytimes.com.

You can find transcripts (posted midday) and more episodes of “The Ezra Klein Show” at nytimes.com/ezra-klein-podcast, and you can find Ezra on Twitter @ezraklein. Book recommendations from all our guests are listed at https://www.nytimes.com/article/ezra-klein-show-book-recs.

“The Ezra Klein Show” is produced by Emefa Agawu, Annie Galvin, Jeff Geld, Roge Karma and Kristin Lin. Fact-checking by Michelle Harris and Kate Sinclair. Mixing by Jeff Geld. Original music by Isaac Jones. Audience strategy by Shannon Busta. The executive producer of New York Times Opinion Audio is Annie-Rose Strasser. Special thanks to Carole Sabouraud and Kristina Samulewski.