TLDR: Voice AIs aren't that much cheaper than human call center workers in 2025
My friend runs a voice agent startup in Canada for walk-in clinics. The AI takes calls and uses tools to book appointments in the EMR (electronic medical record) system. In theory, this helps the clinic hire fewer front desk staff, and the startup makes infinite money. In reality, the margins are brutal and they barely charge above cost. This is surprising to me: surely a living, breathing, squishy human costs more per hour than a GPU in a datacenter somewhere?
An industry overview of voice AIs
Broadly speaking, there are three types of companies in the voice AI industry:
- Foundation model companies
  - These companies actually train the text-to-speech and realtime audio models
  - OpenAI, ElevenLabs, Cartesia
- Pipeline companies
  - Infrastructure companies that aggregate foundation model providers so you can experiment across them, build agents, and connect to SIP and WebRTC transports (think OpenRouter but with extra steps)
  - Developer focused: n8n, Bland, Vapi
  - Enterprise focused: Ada, Sierra, Fin
- Vertical startups
  - Startups that do "voice agents for {healthcare | logistics | real estate | etc.}"
  - Here are 142 of them
Of course, these [...]
---
Outline:
(00:47) An industry overview of voice AIs
(01:57) The line by line breakdown
(02:22) Speech to Text (STT) → LLM → Text to Speech (TTS)
(03:09) Realtime API
(03:42) Comparison to Humans and Business Process Outsourcing (BPO)
(04:46) Assumptions
(05:20) Limitations
(06:40) The Future
(07:43) Conclusion
---
First published:
November 25th, 2025
Source:
https://www.lesswrong.com/posts/rJatmEDcYrDQcwstT/the-economics-of-replacing-call-center-workers-with-ais
---
Narrated by TYPE III AUDIO.