

Building Voice AI Agents That Don’t Suck with Kwindla Kramer - #739
72 snips Jul 15, 2025
Kwindla Kramer, co-founder and CEO of Daily, shares his insights on building real-time conversational voice AI. He discusses the full stack of voice agents, emphasizing the importance of a modular approach for better latency and cost-efficiency. Kwindla delves into challenges like interruption handling and natural dialogue dynamics. He also highlights the future of voice AI in use cases, hybrid edge-cloud pipelines, and exciting advancements like real-time video avatars. It's a comprehensive look at the dynamic world of voice technology!
AI Snips
Chapters
Transcript
Episode notes
Voice AI Spark and Pipecat Origin
- Kwindla Kramer got convinced voice AI is central after GPT-4 enabled new ways for humans to talk to computers.
- Daily open-sourced their internal tools, creating the popular Pipecat voice agent framework.
Understand Voice AI Stack
- For voice AI, understand the stack: models, APIs, orchestration, and application code.
- Use frameworks like Pipecat to manage multi-turn conversations and interruption handling with vendor neutrality.
Enterprise Voice AI Proves Viability
- Enterprise voice AI has proven viable with multi-model systems handling phone calls reliably.
- Consumer demos like ChatGPT voice are products in progress, lacking full interruption and noise management needed for real use.