MLOps.community

Demetrios

Relaxed conversations around getting AI into production, whatever shape that may come in (agentic, traditional ML, LLMs, vibes, etc.)

Episodes

Jan 31, 2025 • 50min

Real LLM Success Stories: How They Actually Work // Alex Strick van Linschoten // #287

Alex Strick van Linschoten, a Machine Learning Engineer at ZenML with a PhD in History, delves into practical applications of large language models (LLMs). He shares insights from his comprehensive database on LLM use cases, emphasizing both common and innovative applications. The discussion covers the technical challenges of deploying LLMs, the significance of engineering practices, and the evolution of support bots using user behavior insights. Alex also calls for community contributions to enhance collective knowledge in this rapidly changing field.

Jan 27, 2025 • 1h 1min

Navigating Machine Learning Careers: Insights from Meta to Consulting // Ilya Reznik // #286

Ilya Reznik, an ML Engineering Thought Leader with 13 years at Meta, Adobe, and Twitter, shares his journey and insights on navigating machine learning careers. He discusses the limitations of traditional model fine-tuning and promotes innovative methods like prompt engineering. Ilya emphasizes the significance of practical applications from recent conferences and offers guidance for aspiring ML engineers aiming for senior roles. His experience blends technical expertise with practical career advice, making the episode a gem for those in the AI field.

Jan 24, 2025 • 53min

Collective Memory for AI on Decentralized Knowledge Graph // Tomaž Levak // #285

Tomaž Levak, Co-founder and CEO of Trace Labs, dives into the world of decentralized knowledge graphs and their role in AI. He discusses how these graphs enhance data integrity and privacy while promoting collaboration among organizations. Practical use cases in enterprise sectors are highlighted, showcasing their economic potential. Levak also explores the fusion of AI and personal health management, emphasizing innovative technologies that improve well-being. The conversation concludes with insights on the future of decentralized AI and its convergence with blockchain.

Jan 17, 2025 • 52min

Efficient Deployment of Models at the Edge // Krishna Sridhar // #284

In this engaging discussion, Krishna Sridhar, an engineering leader at Qualcomm and former co-founder of Tetra AI, dives into the efficient deployment of AI models at the edge. He shares insights on using Qualcomm AI Hub to optimize models for on-device performance, highlighting its application in real-time sports tracking and mobile photography. Krishna also explores the balance between hardware and software optimization in modern devices. Plus, he reveals how innovations in edge computing are transforming everyday AI applications while ensuring user privacy.

Jan 15, 2025 • 47min

Real World AI Agent Stories // Zach Wallace // #283

Zach Wallace, a Staff Software Engineer at Nearpod Inc., shares his expertise in AI integration within e-commerce and edtech. He discusses how AI agents enhance personalized user targeting and streamline data with tools like Redshift and dbt. The conversation delves into the challenges of maintaining AI systems, ensuring data quality, and the balance between specialization and cost in agent performance. Zach emphasizes the transformative potential of LLMs in education and the importance of educator involvement for effective AI tool development.

Jan 8, 2025 • 1h 5min

Machine Learning, AI Agents, and Autonomy // Egor Kraev // #282

Egor Kraev, Principal AI Scientist at Wise Plc and founder of the Swiss Pirate Party, dives into the transformative power of AI in fintech. He shares insights on integrating large language models into machine learning pipelines and the practical implications of his open-source MotleyCrew framework. Egor highlights the role of AI in improving fraud detection and optimizing currency flow. He also discusses the importance of autonomy within teams, navigating causal inference in marketing, and enhancing user engagement through targeted campaigns.

Jan 3, 2025 • 51min

Re-Platforming Your Tech Stack // Michelle Marie Conway & Andrew Baker // #281

In this discussion, Michelle Marie Conway, Lead Data Scientist at Lloyds Banking Group, and Andrew Baker, Data Science Delivery Lead, share insights from their cloud migration journey. They delve into the transition from on-prem technology to the cloud, highlighting the complexities of model management and engineering practices. Their conversation also touches on the harmony between music and technology, the challenges of chaos engineering in regulated environments, and the importance of collaboration within data science and platform teams.

Dec 23, 2024 • 58min

Holistic Evaluation of Generative AI Systems // Jineet Doshi // #280

In this insightful discussion, Jineet Doshi, an award-winning AI lead with over seven years at Intuit, dives deep into the complexities of evaluating generative AI systems. He emphasizes the importance of holistic evaluation to foster trust and the unique challenges posed by large language models. Jineet explores diverse evaluation methods, from classic NLP techniques to innovative strategies like red teaming. He also tackles the financial nuances of generative AI and the balance between human insight and automated feedback for robust assessments.

Dec 20, 2024 • 1h 15min

Unleashing Unconstrained News Knowledge Graphs to Combat Misinformation // Robert Caulk // #279

Robert Caulk, the founder of Emergent Methods and an expert in large-scale applications, discusses the cutting-edge development of unconstrained knowledge graphs to counter misinformation. He reveals how new tools allow for the processing of vast amounts of news data more efficiently. The podcast explores the integration of knowledge graphs with AI, enhancing user interaction and the fight against false narratives. Caulk emphasizes the ethical challenges of data handling and the role of advanced AI models in improving sentiment analysis, showcasing a future of responsible information management.

Dec 17, 2024 • 50min

LLM Distillation and Compression // Guanhua Wang // #278

Guanhua Wang, a Senior Researcher in the DeepSpeed team at Microsoft, dives into the revolutionary Domino training engine, designed to eliminate communication overhead during LLM training. He discusses the intricacies of naming the Phi-3 model and the growing interest in smaller language models. Wang highlights advanced techniques like data offloading and quantization, showcasing how Domino can speed up training by up to 1.3x compared to existing methods, while addressing privacy in customizable copilot models. It's a deep dive into optimizing AI training!
