Anthropic researchers find that AI models can be trained to deceive
Jan 16, 2024
Most humans learn the skill of deceiving other humans. Can AI models learn the same? Yes, it seems, and terrifyingly, they’re exceptionally good at it.