

The Data Exchange with Ben Lorica
Ben Lorica
A series of informal conversations with thought leaders, researchers, practitioners, and writers on a wide range of topics in technology, science, and of course big data, data science, artificial intelligence, and related applications. Anchored by Ben Lorica (@BigData), the Data Exchange also features a roundup of the most important stories from the worlds of data, machine learning, and AI. Detailed show notes for each episode can be found at https://thedataexchange.media/. The Data Exchange podcast is a production of Gradient Flow [https://gradientflow.com/].
Episodes

Jun 13, 2024 • 36min
Fine-tuning and Preference Alignment in a Single Streamlined Process
Jiwoo Hong and Noah Lee from KAIST AI discuss their method, ORPO, which combines supervised fine-tuning and preference alignment in a single step. They highlight the advantages of their approach, such as minimal data requirements, bias prevention, and enhanced adaptability of language models. ORPO has received positive feedback from the research community and industry for aligning models efficiently and scaling them with smaller datasets.

Jun 6, 2024 • 25min
TinyML, Sensor-Driven AI, and Advances in Large Language Models
Pete Warden, founder of Useful Sensors, discusses the development of trustworthy AI for consumer electronics, advances in tiny large language models and sensor-driven AI, the concept of Dark Compute, the use of CPUs and sensors for AI applications in consumer devices, and ways to get involved in the TinyML and sensor-driven AI community.

May 30, 2024 • 50min
Machine Unlearning: Techniques, Challenges, and Future Directions
Ken Liu, a Ph.D. student at Stanford, discusses the concept of machine unlearning in AI models. The conversation explores challenges such as effectively removing specific data points, evaluating generative AI models, and linking privacy-preserving ML techniques with unlearning. It also traces the evolution of unlearning techniques, highlighting the need for benchmarks and more advanced implementation methods.

May 23, 2024 • 39min
Unleashing the Power of AI Agents
Joao (Joe) Moura, founder of crewAI, discusses the simplicity of developing AI agents using large language models. The conversation explores the use of AI agents in various tasks, emphasizing the importance of multi-agent architectures and the potential for multimodal AI. It also covers selecting suitable use cases for agent solutions, software engineering challenges, and the role of AI agents in enterprise processes, along with concerns about prompt injection risks and upcoming features for AI projects.

May 16, 2024 • 42min
Monthly Roundup: Llama 3, Agents, Evaluation Metrics, Cyc, TikTok, and more
Paco Nathan, Founder of Derwen, talks about Llama 3 advancements, open foundation models, evolving AI agents, and the importance of data engineering. The conversation also covers the limitations of leaderboards in evaluating AI models and touches on the ethical implications of AI development.

May 9, 2024 • 43min
LLMs for Data Access: Unlocking Insights with Text-to-SQL
Gunther Hagleitner discusses text-to-SQL technology for data analytics, adoption challenges, RAG integration for better SQL, and future advancements in text-to-SQL systems.

May 2, 2024 • 54min
2024 Artificial Intelligence Index
Nestor Maslej discusses the 2024 AI Index Report, covering topics such as benchmarks surpassing human capabilities, advancements in agentic AI research, the debate between closed and open large language models, a comparison of the AI landscape in China and the US, the complexities of synthetic data and responsible AI, and AI's impact on scientific problem-solving.

Apr 25, 2024 • 46min
DBRX and the Future of Open LLMs
Hagay Lupesko, Senior Director of Engineering at Databricks MosaicAI, discusses DBRX, an open LLM designed to balance quality and cost efficiency. Topics include data control, collaboration in the AI community, model training, serving, and optimization, sustaining open source models, future plans for DBRX, hybrid RAG, tool utilization, knowledge graphs, and opportunities to engage with the open-source project.

Apr 18, 2024 • 37min
Monthly Roundup: New LLMs, GTC 2024, Constraint-Driven Innovation, Model Safety, and GraphRAG
Paco Nathan, Founder of Derwen, discusses updates on large language models and advancements in efficiency and scalability. Topics include Constraint-Driven Innovation, GTC 2024 highlights, and lessons from AI workload security exploits. The discussion also covers model improvements, generative AI tools, and the importance of data engineering for AI safety.

Apr 11, 2024 • 36min
Automating Software Upgrades: How to Combine AI and Expert Developers
Steve Pike, Co-founder of Infield.ai, discusses automating software upgrades by blending AI and expert developers to handle tasks such as security fixes and bug-fix updates. The conversation touches on the importance of combining automation with human expertise, the use of data sources like GitHub, and the role of AI in speeding up software development.