Data Driven

Latest episodes

Apr 22, 2025 • 58min

Jacob Leverich on Efficiency, Elegance, and the Joy of Not Grepping log files at 2AM

This week, Frank sat down with Dr. Jacob Leverich, Stanford PhD, cofounder of Observe, and a veteran of the Google MapReduce team and Splunk. Jacob’s journey, from tinkering with video game code as a kid to innovating at the cutting edge of distributed systems and energy efficiency, is as inspiring as it is informative.

Key Takeaways
Early Tech Roots: Hear how curiosity with QBasic and classic PCs (think IBM PC/XT and Commodore) put Jacob on a path to high-impact data engineering.
MapReduce, Dremel, & the Rise of Big Data: Jacob pulls back the curtain on working with some of the most influential data processing tools at Google and how these systems shifted the entire data landscape (hello, BigQuery!).
Building Efficient Systems: It’s not just about scale; energy efficiency and performance optimization are the unsung heroes of today’s data infrastructure. Jacob explains why making things “just work” isn’t enough anymore.
The Realities of Ops & Observability: Remember the days of grepping logs at 2AM? There’s a better way. Jacob shares how platforms like Observe help teams consolidate, visualize, and act on operational data, turning chaos into actionable insight.
Bridging Data & Ops: The lines between data observability and traditional ops are blurring, and Jacob’s unique experience shows how best practices from data warehousing are finally making ops smoother (and less sleepless).
Power Concerns & the Future: As data grows, so does energy consumption in data centers. Find out why optimization isn’t just good for performance; it’s key to sustainability.

Timestamps
00:00 Interview with Jacob Leverich
05:59 Journey into Game Programming
06:43 "Pursuing Fast Video Game Code"
10:23 Data Processing and Power Efficiency
16:11 Snowflake's Transformative Database Approach
19:18 Journey to Data Management Industry
21:37 Data Products: Solving Core Challenges
27:07 Early Web Log Analysis Techniques
28:57 Consolidating Data for Efficiency
33:23 Specialized Tools and Context Switching
35:43 Unique Dual-Expertise in Tech
38:58 User-Centric Business Strategies
42:13 IP Data Analysis in Cloud
47:23 Electricity Transport Upsets Local Farms
48:25 Shift to Parallel Computing
52:10 Hardware Specialization & Software Optimization
57:32 "Stay Data Driven"
Apr 14, 2025 • 58min

István Mészáros on going From CERN to Startup & The Cat That Launched a Thousand Queries

Welcome to another insightful episode of Data Driven! Today, we're diving into the world of warehouse-native analytics with our special guest, István Mészáros, cofounder of Mitzu. Join us as we explore how Mitzu empowers startups and enterprises with a new approach to data analytics. From his beginnings as a CERN physicist to becoming an open-source evangelist and finally a startup founder, István shares his unique journey through the data industry.

We'll discuss the motivation behind Mitzu's distinct branding, reminiscent of Hello Kitty, and why standing out in today's crowded market is crucial. István also reveals the challenges and strategies of building a data company in Europe, and how Mitzu simplifies analytics by offering a self-service solution without the high costs associated with existing market leaders.

Timestamps
00:00 Introducing István Mészáros
05:30 Shifting Open Source to SaaS
07:46 Lava-Themed Compliance Solutions Brand
10:27 Tech Branding and Hello Kitty Insights
13:46 Optimizing Conversion in Data-Heavy Travel
16:31 Self-Service Analytics Tool Needed
19:17 Automated Product Analytics Tool
23:20 "Budget Constraints and DIY Solutions"
28:17 Freelancer's Efficient Data Solutions
29:08 Open Source Tool Productization Plan
33:13 Navigating Freelance and Startup Challenges
37:19 Transitioning to Data Engineering
42:25 Instant Feedback in Hobbies
43:46 Embracing Feedback in Business Transformation
49:13 "Hoping AI Takes Over Hiring"
51:58 Visit Site for Info & Contact
55:22 "Parenting Boys with Earbuds"
57:25 "Data Driven: Quantum Podcast Relaunch"
Apr 1, 2025 • 54min

Barr Moses on How Data Observability Can Save Your Company Millions

On this episode of Data Driven, we welcome Barr Moses, CEO and co-founder of Monte Carlo, as she delves into the fascinating world of data observability. Join hosts Frank La Vigne and Andy Leonard as they explore how reliable data is crucial for making sound business decisions in today's tech-driven world. Learn why a simple schema change at Unity resulted in a $100 million loss and how Monte Carlo is developing cutting-edge solutions to prevent similar disasters. From discussions on ensuring data integrity to the intriguing potential of AI in anomaly detection, Barr Moses shares insights that might just redefine your understanding of data's role in business. Tune in for a podcast that not only uncovers the nuances of data reliability but also touches on the quirky side of tech, like why, according to Google, you should never use superglue to fix slipping cheese on your pizza.

Moments
00:00 Monte Carlo: Data Reliability Innovator
05:45 "Data & AI Observability Engineering"
09:42 Data Industry's Growing Importance
12:00 Cereal Supply Chain Data Optimization
16:03 Data Observability and Lineage
19:29 GenAI Uncertainties and Latency Concerns
23:17 "Human Oversight in AI Accuracy"
24:12 Data Observability and Human Role
28:01 Adapting to Customer Language
33:29 Data and Security Management Alignment
35:20 Data Reliability and Observability Challenges
38:17 Automated Code Analysis Tool Launch
42:29 Data-Inspired Childhood
44:12 Passionate About Impactful Work
48:52 LinkedIn Security Concerns Highlighted
53:19 "Data Observability Insights"
Mar 4, 2025 • 45min

Sanjay Annadate on Data Driven Digital Transformation

In this episode, Sanjay joins Frank for a deep dive into the heart of digital transformation and AI-powered automation. Here are some of the key takeaways:

Digital Transformation Evolution: Sanjay reflects on his nearly three-decade journey witnessing the digital shift from infancy to the AI-driven present. He outlines the critical components of digital transformation, including cloud adoption and data prioritization, noting significant changes in business focus over recent years.
Microsoft's Role: Sanjay provides insights into Microsoft's strategic investments in digital transformation technologies, emphasizing their pivotal role in influencing market trends and industry-specific capabilities.
AI-Powered Enhancements: From the widespread adoption of Copilot to the burgeoning concept of agentic AI, Sanjay discusses how AI tools are not replacing but augmenting the productivity of data engineers, offering a glimpse into the future of business processes.
Edge of Innovation: We explore how Microsoft Fabric and other technologies are simplifying complex architectures, allowing businesses to leverage multi-cloud strategies effectively, keeping them at the forefront of innovation.
Real-Life Impact: Sanjay shares compelling examples, like reducing sales briefing preparation time from four days to two minutes, showcasing the transformative power of AI in real business scenarios.

Whether you're a data engineer, business leader, or just someone fascinated by the data-driven world, this episode is packed with valuable insights.

Moments
00:00 Three Decades of Digital Transformation
05:27 Microsoft's Digital Transformation Dominance
09:37 Microsoft's Cloud Integration Advantage
13:22 Red Hat AI's Open Source Approach
15:33 Microsoft Fabric's Multi-Cloud Integration Strategy
20:01 "Custom Solutions for Complex Queries"
21:39 Content Creation Efficiency Unlocked
26:38 Sales Role Dependency Reduction Tool
30:06 Agentic AI and Workflow Transformation
33:29 "Beyond Basic Automation"
35:05 AI's Impact on Business Expansion
39:58 Data-Driven Problem Solving Impact
41:58 Reading Trends in Data Innovation
Feb 25, 2025 • 54min

Trevor Schulze on How CIOs Can Drive AI Strategy

In this episode, Andy Leonard and Frank La Vigne are thrilled to be joined by Trevor Schulze, the Chief Information Officer at Alteryx. Trevor brings an unparalleled perspective on digital transformation, drawing from his impressive tenure at industry giants such as Micron, Cisco, and RingCentral.

Timestamps
00:00 "Data Driven: AI & CIO Insights"
04:32 CIO's Role in AI Evolution
06:50 CIO's Evolving Role with AI
11:43 "Embracing Data Democratization"
16:24 Democratizing Data Access
19:33 "AI Investment and Optimization Cycle"
20:55 AI Enhances Tool Configuration Guidance
24:42 Breaking Free from Vendor Lock-In
27:41 "Unleashing Shadow AI and Technical Debt"
31:53 Digital Performance Essential for All Industries
34:01 Data Privacy Concerns in AI Use
37:30 AI Democratization Challenges for Enterprises
42:15 AI Transforming Business Processes
43:55 Data-Driven Career Journey
47:13 "Building Trust in Data Analytics"
52:34 Building Trust in Future Tech
Feb 6, 2025 • 60min

Lillian Pierson on Revolutionizing Growth Marketing with AI

Andy Leonard and Frank La Vigne delve into the exciting world of AI and growth marketing with the renowned Lillian Pierson. Lillian, a globally recognized AI growth strategist and author, shares her unique journey from engineering to data science and her role as a fractional CMO. She provides deep insights into leveraging AI to revolutionize marketing and growth strategies, discusses breaking down the barriers in early data science, and explores the rise of agentic AI. This conversation is filled with valuable knowledge, humor, and a reality check on the evolving tech landscape. Tune in to explore how AI and data-driven approaches are transforming industries and why Data Driven is a top pick for AI enthusiasts.

Moments
00:00 "Interview with AI Expert Lillian Pierson"
04:18 Earning a Professional Engineering License
09:21 Evolution of Data Science Disciplines
11:08 Career Pivot to Success
14:01 Data Strategy and AI Insights
19:19 Marketing's Role in Product Growth
21:58 Customer Advocacy in Product Development
26:16 Exploring AI for Content Automation
28:28 OpenAI Trained on My Style
30:51 Frank's Podcast Automation Expansion
33:22 "Delegation vs. Self-Management Discussion"
37:45 Decoupled, Resilient System Communication
41:57 Clay-Powered Decision Tech Critique
45:41 AI Is Essential in Business
49:09 Debating with ChatGPT's Perspectives
50:23 Google AI: Generative Podcast Tool
56:11 Big Data Fallacies Explored
Jan 28, 2025 • 1h 2min

Dean Guida on AI Insights, Data Analytics, and Business Growth

Today, we've got an exciting episode lined up for you. Hosts Frank La Vigne and Bailey dive deep into the tech universe with Dean Guida, the CEO and founder of Infragistics. Dean brings his 35-year journey and expansive experience in technology to the table, reminiscing about the early days of software development and his transition into the data-driven world.

In this conversation, you'll hear about the evolution of Infragistics from building UI components for Windows to creating sophisticated data analytics and AI tools. Dean also shares insights from his new book, "When Grit is Not Enough," focusing on how entrepreneurs can foster agile, data-driven learning organizations. Whether you're a seasoned developer, a budding entrepreneur, or someone fascinated by the intersection of AI and data, this episode promises a wealth of knowledge and inspiration.

Join us as we explore technology old and new, from the bygone era of Windows 3.0 to the cutting-edge capabilities of AI today. Plus, hear Dean's personal journey of navigating through various technological and economic shifts over the decades. Make sure to tune in for a discussion that bridges the past, present, and future of tech innovation!

Show Notes
00:00 35 Years of UI/UX Innovation
06:35 "Simplicity, Beauty, and Conversational AI"
15:29 Enhancing User Trust Through Transparency
19:52 AI-Driven Learning and OKR Management
26:20 Kids Reflecting Tech Evolution
27:12 "AI in Future Work Environments"
33:14 "Data-Driven Leadership and Team Alignment"
38:44 Entrepreneurship Beyond Grinding
48:19 Contextual Understanding in AI Assistants
51:57 Overprotected Generation's Communication Challenges
54:55 Generational Impact of Pandemics
01:00:47 "Data-Driven Podcast: Ranked 38"
Jan 21, 2025 • 52min

Arjun Patel on Vector Databases and the Future of Semantic Search

Today, we delve into the intriguing world of vector databases, retrieval augmented generation, and a surprising twist: origami.

Our special guest, Arjun Patel, a developer advocate at Pinecone, will be walking us through his mission to make vector databases and semantic search more accessible. Alongside his impressive technical expertise, Arjun is also a self-taught origami artist with a background in statistics from the University of Chicago. Together with co-host Frank La Vigne, we explore Arjun’s unique journey from making speech coaching accessible with AI at Speeko to detecting AI-generated content at Appen.

In this episode, get ready to unravel the mysteries of natural language processing, understand the impact of the attention mechanism in transformers, and discover how AI can even assist in the art of paper folding. From discussing the nuances of RAG systems to sharing personal insights on learning and technology, we promise a session that’s both enlightening and entertaining. So sit back, relax, and get ready to fold your way into the fascinating layers of AI with Arjun Patel on Data Driven.

Show Notes
00:00 Arjun Patel: Bridging AI & Education
04:39 Traditional NLP and Geometric Models
08:40 Co-occurrence and Meaning in Text
13:14 Masked Language Modeling Success
16:50 Understanding Tokenization in AI Models
18:12 "Understanding Large Language Models"
22:43 Instruction-Following vs Few-Shot Learning
26:43 "Rel AI: Open Source Data Tool"
31:14 "Retrieval-Augmented Generation Explained"
33:58 "Pinecone: Efficient Vector Database"
37:31 "AI Found Me: Intern to Innovator"
41:10 "Impact of Code Generation Models"
45:25 Personalized Learning Path Technology
46:57 Mathematical Complexity in Origami Design
50:32 "Data, AI, and Origami Insights"
Jan 14, 2025 • 53min

Niv Braun on AI Security Measures and Emerging Threats

In today's episode, we're thrilled to have Niv Braun, co-founder and CEO of Noma Security, join us as we tackle some pressing issues in AI security.

With the rapid adoption of generative AI technologies, the landscape of data security is evolving at breakneck speed. We'll explore the increasing need to secure systems that handle sensitive AI data and pipelines, the rise of AI security careers, and the looming threats of adversarial attacks, model "hallucinations," and more. Niv will share his insights on how companies like Noma Security are working tirelessly to mitigate these risks without hindering innovation.

We'll also dive into real-world incidents, such as compromised open-source models and the infamous PyTorch breach, to illustrate the critical need for improved security measures. From the importance of continuous monitoring to the development of safer formats and the adoption of a zero trust approach, this episode is packed with valuable advice for organizations navigating the complex world of AI security.

So, whether you're a data scientist, AI engineer, or simply an enthusiast eager to learn more about the intersection of AI and security, this episode promises to offer a wealth of information and practical tips to help you stay ahead in this rapidly changing field. Tune in and join the conversation as we uncover the state of AI security and what it means for the future of technology.

Quotable Moments
00:00 Security spotlight shifts to data and AI.
03:36 Protect against misconfigurations, adversarial attacks, new risks.
09:17 Compromised model with undetectable data leaks.
12:07 Manual parsing needed for valid, malicious code detection.
15:44 Concerns over Agiface models may affect jobs.
20:00 Combines self-developed and third-party AI models.
20:55 Ensure models don't use sensitive or unauthorized data.
25:55 Zero Trust: mindset, philosophy, implementation, security framework.
30:51 LLM attacks will have significantly higher impact.
34:23 Need better security awareness, exposed secrets risk.
35:50 Be organized with visibility and governance.
39:51 Red teaming for AI security and safety.
44:33 Gen AI primarily used by consumers, not businesses.
47:57 Providing model guardrails and runtime protection services.
50:53 Ensure flexible, configurable architecture for varied needs.
52:35 AI, security, innovation discussed by Niv Braun.
Dec 24, 2024 • 1h 35min

*Live* Tis the Season for SSIS

In this livestream, Frank and Andy discuss the timeless nature of backend enterprise tech that, much like a Christmas special from decades ago, is still very much celebrated.

Moments
00:00 Exploring SSIS future in a festive episode.
08:28 Data engineering evolved from business intelligence systems.
10:57 Social networks project before Facebook's popularity.
19:19 SSIS training informed data engineering concepts teaching.
24:56 Bill Gates moved project to immature Microsoft tooling.
29:10 Data engineering possible in 2024 using T-SQL.
35:23 Huge cloud companies surpass previous brick-and-mortar giants.
40:10 Old technologies endure; misconceptions about their age.
46:03 Evaluate change benefits: technical ease, business growth.
52:30 Cloud departure interests rise, SSIS assistance sought.
55:47 Big government agency utilizing diverse cloud platforms.
01:00:59 Security is crucial; clients' preferences vary.
01:08:56 Certification issues hinder software updates and compliance.
01:10:02 People stick with older systems for reasons.
01:15:15 Proper GPU driver drastically improved loading time.
01:22:16 Repost increased engagement and communication with author.
01:25:45 Data scientists should learn SQL for simplicity.
01:31:06 Obsolete systems cause issues without quotes.
