

The Data Exchange with Ben Lorica
Ben Lorica
A series of informal conversations with thought leaders, researchers, practitioners, and writers on a wide range of topics in technology, science, and of course big data, data science, artificial intelligence, and related applications. Anchored by Ben Lorica (@BigData), the Data Exchange also features a roundup of the most important stories from the worlds of data, machine learning, and AI. Detailed show notes for each episode can be found at https://thedataexchange.media/. The Data Exchange podcast is a production of Gradient Flow [https://gradientflow.com/].
Episodes

Oct 31, 2024 • 30min
Cracking the Code: How Enterprises Are Adopting Generative AI
Tim Persons, an AI Leader at PwC specializing in next-generation audit and trust solutions, delves into the intricate world of generative AI adoption. He discusses how companies are cautiously implementing generative AI, focusing on internal applications first. The conversation highlights the increasing budgets and underestimated costs of deployment, emphasizing trust and cultural adaptation. Persons also stresses the importance of cross-functional collaboration, the necessity for workforce education, and learning by doing to navigate the evolving landscape of AI technologies.

Oct 24, 2024 • 53min
Monthly Roundup: Ray Compiled Graphs, Llama 3.2 and Multimodal AI, and Structured Data for RAG
In this insightful conversation, Paco Nathan, founder of Derwen and an expert in Data and AI, explores groundbreaking innovations from the Ray Summit, focusing on Ray Compiled Graphs for GPU efficiency. He dives into the complexities of AI regulation and the implications of recent legislative actions in California. The dialogue also highlights the integration of structured and unstructured data, the significance of user annotations, and the competitive dynamics within AI, including the advances of the Llama 3.2 model and its multimodal capabilities.

Oct 17, 2024 • 41min
Reimagining Code: The AI-Driven Transformation of Programming and Data Analytics
Matt Welsh, a technical leader at Aryn AI and former Harvard professor famous for his connection to Mark Zuckerberg, discusses how AI is transforming programming and data analytics. He highlights the shift towards natural language coding, which makes programming accessible to non-programmers. The conversation delves into the importance of human oversight in AI-generated code and the potential of AI to refine mentorship and ETL processes. Welsh also explores the challenges of working with knowledge graphs and emphasizes the need for robust evaluation tools in AI development.

Oct 10, 2024 • 51min
The Security Debate: How Safe is Open-Source Software?
Mars Lan, Co-founder and CTO of Metaphor, sheds light on the security challenges surrounding open-source software, debunking myths of its safety in critical industries. He discusses the complexities of dependency management, revealing common vulnerabilities in popular programming languages like Python and TypeScript. The conversation also dives into the contrasting security dynamics of open-source versus proprietary software and emphasizes accountability. Additionally, Lan highlights how Metaphor enhances data understanding and trust through innovative graph technologies.

Oct 3, 2024 • 60min
Generative AI in Voice Technology
Yishay Carmiel, CEO of Meaning, delves into the innovative world of generative AI in voice technology. He shares insights on real-time voice transformation and the emotional connections users can form with AI. The discussion highlights advancements in text-to-speech systems and the implications of deepfakes. Yishay emphasizes the ethical considerations surrounding voice cloning and the debate over open vs. closed-source technologies, while showcasing how these innovations are shaping customer support and human-computer interaction.

Sep 26, 2024 • 38min
Building An Experiment Tracker for Foundation Model Training
Aurimas Griciūnas, Chief Product Officer at Neptune.AI, dives into the complexities of training large language models and the critical need for effective experiment tracking. He discusses the transition from MLOps to LLMOps and how traditional tools struggle with the data demands of foundation models. Griciūnas highlights the challenges of operating massive GPU clusters and the importance of checkpoints for fault tolerance. The episode also covers breakthroughs in AI reasoning and the fine-tuning approaches essential for enterprises navigating this evolving landscape.

Sep 19, 2024 • 46min
Monthly Roundup: AI Regulations, GenAI for Analysts, Inference Services, and Military Applications
Paco Nathan, founder of Derwen, discusses pressing topics in AI and technology. The conversation dives into significant regulatory efforts like California's Senate Bill 1047, which aims to manage AI standards while fostering innovation. Nathan and host Ben Lorica explore how AI tools empower consumers against insurance claim denials and the legal challenges surrounding AI technologies. The episode also highlights AI's impact on military strategies in Ukraine, ethical concerns about AI in warfare, and the necessity for flexible hardware and software integration in AI systems.

Sep 12, 2024 • 38min
Unlocking the Power of LLMs with Data Prep Kit
Petros Zerfos and Hima Patel, both from IBM Research, are key developers of Data Prep Kit, an open-source toolkit that facilitates data preparation for large language models. They discuss how DPK enhances the processing of raw text and code data, emphasizing its features like data cleansing and deduplication. The duo highlights its compatibility with cloud environments and vector databases. They also explore multimodal capabilities, showcasing its potential for processing diverse data types, including documents in multiple languages.

Sep 5, 2024 • 25min
Advancing AI: Scaling, Data, Agents, Testing, and Ethical Considerations
Dr. Andrew Ng, a leading AI visionary and founder of DeepLearning.AI, shares his insights on the transformative power of AI. He discusses the evolution of GPU technology and its pivotal role in data-centric AI. The conversation highlights the game-changing impact of large language models on user interactions and enterprise applications. Ng also addresses the future of reinforcement learning and the ethical considerations tied to AI deployment, emphasizing the need for a community-driven approach to innovation in the field.

Aug 29, 2024 • 48min
Bridging the Hardware-Software Divide in AI
Jay Dawani, CEO of Lemurian Labs, dives into the challenges of bridging hardware and software in AI development. He discusses how model size influences performance and hurdles in achieving artificial general intelligence. The conversation highlights the critical need for seamless integration between training and inference, as well as the complexities of AI deployment. Dawani also explores the future of supercomputing in AI and the importance of optimizing data representation, showcasing innovative strategies to enhance computational capabilities.