

The good, the bad, and the future of AI agents
Oct 2, 2025
David Hershey, who leads the Applied AI team at Anthropic, discusses the groundbreaking Claude Sonnet 4.5. He highlights the impressive strides AI agents have made in coding and persistent tasks, while also addressing their shortcomings. David dives into the unexpected applications of AI in legal work, the swift adaptability of Sonnet 4.5, and its remarkable performance in autonomous coding tasks. He also emphasizes the evolving engineering landscape, where collaboration with AI tools can redefine productivity.
AI Snips
Agents Shine Where Kinks Are Fixed
- Agents excel in domains where their blind spots have been ironed out, with coding being the clearest early win.
- Progress is fast but uneven across industries because each task contains many small failure points to fix.
Incorporate Specialist Knowledge Directly
- Learn directly from domain specialists by embedding their expertise into models rather than expecting generic data to suffice.
- Work with experts and customers to surface the specific gaps that prevent agents from succeeding in niche tasks.
Claude Rebuilt Claude.ai Overnight
- Anthropic asked Claude Sonnet 4.5 to recreate its consumer app, Claude.ai, and the model built a working clone overnight.
- The model implemented a complex feature (Artifacts) by itself while the team watched it iterate for hours.