

The Data Exchange with Ben Lorica
Ben Lorica
A series of informal conversations with thought leaders, researchers, practitioners, and writers on a wide range of topics in technology, science, and of course big data, data science, artificial intelligence, and related applications. Anchored by Ben Lorica (@BigData), the Data Exchange also features a roundup of the most important stories from the worlds of data, machine learning, and AI. Detailed show notes for each episode can be found on https://thedataexchange.media/. The Data Exchange podcast is a production of Gradient Flow [https://gradientflow.com/].
Episodes

Aug 3, 2023 • 36min
ETL for LLMs
Brian Raymond, founder of Unstructured, discusses the challenges of data preprocessing for NLP solutions, an efficient file-processing architecture for data extraction, innovative data engineering solutions, a comparison of connector capabilities in Airbyte and Fivetran, and the evolution of ETL pipelines for large language models.

Jul 27, 2023 • 1h 1min
The Future of Graph Databases
Emil Eifrem is co-founder and CEO of Neo4j, the leading graph database and graph data science software provider. We discussed a range of topics including the current state of graph databases, graph data science and graph neural networks, vector databases, and the interplay between LLMs, knowledge graphs, and graph databases. Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/ Subscribe: Apple • Spotify • Stitcher • Google • AntennaPod • Podcast Addict • Amazon • RSS. Detailed show notes can be found on The Data Exchange web site.

Jul 20, 2023 • 38min
Delivering Safe and Effective LLM and NLP Applications
David Talby is the CTO and Founder of John Snow Labs, the company behind two popular open source projects: Spark NLP and LangTest. In this episode we focus on LangTest, an open-source Python library designed to help developers deliver safe and effective Natural Language Processing (NLP) models. [Note: After we recorded this episode, NLTest was renamed to LangTest.]

Jul 13, 2023 • 51min
Using Data and AI to Democratize Entity Resolution and Master Data Management
Jeff Jonas is Founder and CEO of Senzing, a startup focused on democratizing entity resolution – making this deceptively complicated task easy for programmers to use and deploy.

Jul 6, 2023 • 49min
An Open Source Data Framework for LLMs
Jerry Liu is CEO and co-founder of LlamaIndex, an open source project and startup that builds tools that enable teams to augment LLMs with their own private data.

Jun 29, 2023 • 50min
Redefining AI Infrastructure: Deploying and Developing with a Next-Generation Developer Platform
Tim Davis is the Co-Founder & Chief Product Officer of Modular, a startup focused on building tools to help simplify AI infrastructure.

Jun 22, 2023 • 39min
The Rise of Custom Foundation Models
Andrew Feldman is CEO and co-founder of Cerebras, a startup that has released the fastest AI accelerator, based on the largest processor. We discussed Cerebras-GPT, a family of language models that have set new benchmarks for accuracy and compute efficiency, with sizes ranging from 111 million to 13 billion parameters.

Jun 15, 2023 • 48min
The Future of Vector Databases and the Rise of Instant Updates
Louis Brandy is VP of Engineering at Rockset, the real-time search and analytics database startup formed by the creators of the popular open source project, RocksDB.

Jun 8, 2023 • 45min
LLMs Are the Key to Unlocking the Next Generation of Search
Amin Ahmad, the co-founder of Vectara, has played a crucial role in developing a powerful API platform specifically tailored for developers.

Jun 1, 2023 • 34min
Building and Deploying Foundation Models for Enterprises
Jonas Andrulis is the Founder & CEO of Aleph Alpha, a startup that provides enterprise software solutions backed by its own large language models and multimodal models.


