
The Data Exchange with Ben Lorica
A series of informal conversations with thought leaders, researchers, practitioners, and writers on a wide range of topics in technology, science, and of course big data, data science, artificial intelligence, and related applications. Anchored by Ben Lorica (@BigData), the Data Exchange also features a roundup of the most important stories from the worlds of data, machine learning, and AI. Detailed show notes for each episode can be found at https://thedataexchange.media/. The Data Exchange podcast is a production of Gradient Flow [https://gradientflow.com/].
Latest episodes

Jul 17, 2025 • 38min
From Human-Readable to Machine-Usable: The New API Stack
Sagar Batchu, CEO of Speakeasy, discusses how API development is changing as AI agents become the primary users of APIs. He covers 'vibe coding' for creating adaptable APIs and the challenges of managing multiple APIs. The conversation touches on enhancing AI integration and the role of tools like Speakeasy in simplifying API interactions. Batchu highlights the importance of multi-cloud platforms and robust security measures, alongside user-experience innovations designed for both technical and non-technical users.

Jul 10, 2025 • 42min
Why Voice Security Is Your Next Big Problem
Yishay Carmiel and Roy Zanbel, co-founders of Apollo Defend, dive into the rapidly evolving landscape of voice AI security. They discuss the alarming implications of voice cloning technology, emphasizing its potential misuse and the urgent need for protective measures. The conversation highlights advancements in human-like speech generation and the complexities of defending against deepfake audio attacks. With voice agents proliferating in customer service, they stress the necessity of robust security measures to safeguard personal authenticity and data privacy.

Jul 3, 2025 • 28min
Unlocking Unstructured Data with LLMs
Shreya Shankar, a PhD student in EECS at UC Berkeley, dives into how Large Language Models (LLMs) are changing the game for unstructured enterprise data. She explains her innovative framework, DocETL, which streamlines semantic extraction and thematic analysis of text and PDFs. The conversation touches on the practical challenges of data extraction and the evolution towards multimodal processing with tools like DocWrangler. Shreya also highlights the importance of aligning user intent with model capabilities for better user experiences.

Jun 26, 2025 • 31min
Building Production-Grade RAG at Scale
Douwe Kiela, Founder and CEO of Contextual AI and an adjunct professor at Stanford, delves into the relevance of Retrieval-Augmented Generation (RAG) amidst evolving AI contexts. He explains the shift to RAG 2.0, emphasizing its potential as an end-to-end trainable system. The conversation highlights the challenges of document understanding, the importance of structured information in extraction, and how hybrid retrieval methods can streamline data access. Douwe also speculates on future advancements in model fine-tuning, emphasizing the need for expert feedback and open-source contributions.

Jun 19, 2025 • 45min
Unlocking AI Superpowers in Your Terminal
Zach Lloyd, Founder and CEO of Warp, discusses how his innovative terminal combines AI and developer insights to enhance productivity. He highlights the evolution of command-line interfaces, emphasizing the efficiency of terminal environments over graphical interfaces. The conversation delves into AI’s role in improving workflows and the collaborative capabilities of Warp Drive. Lloyd also addresses the needs of diverse software developers and advocates for AI tools as enhancements, not replacements, in software engineering, underscoring their potential for early-career developers.

Jun 12, 2025 • 51min
From Vibe Coding to Autonomous Agents
Jackie Brosamer, Head of AI and Data Platform at Block, and Brad Axen, Tech Lead for AI and Data Platform, dive into Codename Goose—a pioneering open-source AI agent designed to automate complex engineering tasks. They discuss its flexible integration and how large language models are reshaping engineering workflows. The duo addresses the challenges of autonomous agents in context management and output validation. They also explore 'vibe coding,' the evolving role of developers, and the generational shifts in adapting to these transformative technologies.

Jun 5, 2025 • 49min
How a Public-Benefit Startup Plans to Make Open Source the Default for Serious AI
Manos Koukoumidis, CEO of Oumi Labs and former tech lead at Microsoft, Meta, and Google, shares his vision for unconditionally open foundation models in AI. He argues for transparency in data, code, and processes as essential for trustworthy technology. Discussions include the evolution of community-driven AI similar to Linux, the importance of a robust evaluation system in open-source contributions, and the innovative Halloumi tool for verifying AI claims. Manos emphasizes balancing innovation with safety as a pathway to reliable and accessible AI.

May 29, 2025 • 54min
The Highly Uncertain Future of OpenAI’s Dominance
Dan Schwarz, CEO and co-founder of Futuresearch, challenges OpenAI's ambitious $125 billion revenue target, citing intense competition and revenue pressures. He discusses the complexities of AI revenue strategies and the impact of user behavior on sustainability. The conversation highlights concerns over talent moving to rivals and contrasts early AI optimism with current market realities. Exploring future revenue streams, Schwarz expresses skepticism about the pace of growth, emphasizing the competitive landscape dominated by tech giants like Google and Meta.

May 22, 2025 • 45min
Beyond Guardrails: Defending LLMs Against Sophisticated Attacks
Jason Martin, an AI Security Researcher at HiddenLayer, delves into the world of AI vulnerabilities and defenses. He explains 'policy puppetry,' a technique that can bypass safety features in language models. The conversation highlights the challenges of AI safety, particularly in multimodal applications, and the importance of robust security measures for enterprises. It also covers the complex interplay of biases in LLMs and the critical role of instruction hierarchy in shaping AI responses, and stresses the need for careful model selection to mitigate risks.

May 15, 2025 • 50min
Navigating the Generative AI Maze in Business
Evangelos Simoudis, Managing Director at Synapse Partners, discusses how businesses can navigate generative AI adoption. He highlights the current gap between interest in AI and actual deployment, and addresses the challenges of moving from proof-of-concept to full implementation. Topics include the decline in pilot projects, the importance of data strategies, and innovative architectures like Retrieval-Augmented Generation. Simoudis also examines the complexities of AI agents and stresses the need for human oversight in an increasingly AI-driven landscape.