We've spent decades teaching ourselves to communicate with computers via text and clicks. Now, computers are learning to perceive the world like us: through sight and sound. What happens when software needs to sense, interpret, and act in real time using voice and vision?
This week, Andrew sits down with Russ d'Sa, co-founder and CEO of LiveKit, whose technology provides the infrastructure that lets machines interact through real-time voice and vision, impacting everything from ChatGPT to critical 911 responses.
Explore the transition from text-based protocols to rich, real-time data streams. Russ discusses LiveKit's role in this evolution, the implications of AI gaining sensory input, the trajectory from copilots to agents, and the hurdles engineers face when building for a world beyond simple text transfers.