

The Data Exchange with Ben Lorica
Ben Lorica
A series of informal conversations with thought leaders, researchers, practitioners, and writers on a wide range of topics in technology, science, and of course big data, data science, artificial intelligence, and related applications. Anchored by Ben Lorica (@BigData), the Data Exchange also features a roundup of the most important stories from the worlds of data, machine learning, and AI. Detailed show notes for each episode can be found at https://thedataexchange.media/. The Data Exchange podcast is a production of Gradient Flow [https://gradientflow.com/].
Episodes

Aug 22, 2024 • 44min
Monthly Roundup: The Economic Realities of Large Language Models
Paco Nathan, founder of Derwen, dives into the latest advancements in large language models, notably the launch of Llama 3.1, whose largest model has roughly 400 billion parameters. He discusses the daunting financial challenges faced by AI developers, emphasizing the competition between startups and tech giants. The conversation also covers cutting-edge research on neural operators, the shift towards custom AI solutions, and vulnerabilities in AI software supply chains. Additionally, listeners are introduced to innovative tools like the Relic library and insights into the cultural impact of technology.

Aug 15, 2024 • 45min
From Hype to Reality: The Current State of Enterprise Generative AI Adoption
Evangelos Simoudis, Managing Director at Synapse Partners, dives into the current landscape of enterprise generative AI adoption. He discusses the cautious but optimistic investments by corporations, the hurdles in transitioning from experimentation to real-world applications, and the critical role of data quality. Simoudis highlights how generative AI enhances productivity in customer support and the complexities of integrating AI into existing processes. He also addresses the financial dynamics of AI investments and the importance of strategic differentiation for startups.

Aug 8, 2024 • 35min
Automating Unstructured Data Extraction with LLMs
Shuveb Hussain, co-founder of Unstract, discusses his innovative no-code platform that automates the extraction of structured data from unstructured documents. He highlights the rise of prompt engineers and their role in data transformation. The conversation dives into the complexities of using large language models and the critical importance of quality optical character recognition. Hussain also addresses the fine-tuning of language models for specific needs and the integration of diverse document types, showcasing how these advancements enhance data processing efficiency.

Aug 1, 2024 • 36min
Generative AI in Context: Hybrid Intelligence and Responsible Development
Alfred Spector, a distinguished expert in networked computing and former leader at IBM, Google, and Two Sigma, discusses generative AI and responsible development. He emphasizes the importance of context in data science to avoid critical pitfalls. The conversation dives into ethical AI practices, arguing for interdisciplinary education to navigate technological impacts. Spector also addresses the need for AI literacy to promote effective integration and explores the challenges of regulating advanced AI while the technology changes rapidly.

Jul 25, 2024 • 46min
Monthly Roundup: Navigating the Peaks and Valleys of Generative AI Technology
Paco Nathan, founder of Derwen, discusses the latest in generative AI technology. Topics include Anthropic's Claude 3.5 Sonnet release, managing risks in AI advancements, enhancing RAG models with graphs, accelerating protein evolution, weather model advancements, AI's role in mathematics, and shady AI practices, along with summer book recommendations.

Jul 18, 2024 • 35min
From Preparation to Recovery: Mastering AI Incident Response
Andrew Burt, co-founder of Luminos.Law and Luminos.ai, discusses AI incident response challenges and preparation. Topics include defining incidents in AI systems, specialized response teams, regulations like SB 1047, contrasting US and European approaches to AI regulation, and the importance of detecting and stopping AI failures.

Jul 11, 2024 • 50min
Unlocking the Power of Unstructured Data
Chang She, CEO of LanceDB, discusses the challenges and innovations in managing unstructured data for AI, including developing new data formats, optimizing AI training workloads, and enhancing applications with multimodal embeddings and vector search.

Jul 3, 2024 • 51min
Postgres: The Swiss Army Knife of Databases
Ajay Kulkarni and Mike Freedman, co-founders of Timescale, discuss how Postgres has evolved into a versatile platform for AI and vector databases. They explore the innovations in Postgres-like database technology, the significance of streaming post-filtering, and the evolution of data formats and database usage, including embedding pipelines for AI applications and handling multimodal data.

Jun 27, 2024 • 44min
Supercharging AI with Graphs
Philip Rathle, CTO of Neo4j, discusses GraphRAG and GQL. Topics include Graph Neural Networks with LLMs, constructing knowledge graphs from various sources, using graphs in AI applications like supply chain risk analysis, benefits in healthcare and customer service, and integrating vector and graph databases for efficient data analysis.

Jun 20, 2024 • 37min
Monthly Roundup: SB 1047, GraphRAG, and AI Avatars in the Workplace
Paco Nathan, founder of Derwen, discusses SB 1047 for regulating AI, GraphRAG techniques, and AI avatars in the workplace. Topics include potential unintended consequences of AI regulation, limitations of integrating symbolic and statistical AI, challenges of AI avatars attending meetings, and advancements in graph analytics and machine learning.