The AI Fundamentalists

Dr. Andrew Clark & Sid Mangalik

A podcast about the fundamentals of safe and resilient modeling systems behind the AI that impacts our lives and our businesses.

Episodes

Aug 8, 2023 • 32min

Synthetic Data in AI

Episode 5. This episode about synthetic data is very real. The fundamentalists uncover the pros and cons of synthetic data, as well as reliable use cases and the best techniques for safe and effective use in AI. When even SAG-AFTRA and OpenAI make synthetic data a household word, you know this is an episode you can't miss.

Show notes

What is synthetic data? 0:03
The definition is not a succinct one-liner, which is one of the key issues with assessing synthetic data generation.
Using general information scraped from the web for ML is backfiring.

Synthetic data generation and data recycling. 3:48
OpenAI is running up against the problem that they don't have enough data at the scale at which they're trying to operate.
The poisoning effect that happens when trying to train on your own data.
Synthetic data generation is not a panacea. It is not an exact science. It's more of an art than a science.

The pros and cons of using synthetic data. 6:46
The pros and cons of using synthetic data to train AI models, and how it differs from traditional medical data.
The importance of diversity in the training of AI models.
Synthetic data is a nuanced field, taking away the complexity of building data that is representative of a solution.

Differences between randomized and synthetic data. 9:52
Differential privacy is a lot more difficult to execute than a lot of people are talking about.
Anonymization is a huge piece of the application for the fairness bias, especially with larger deployments.
The hardest part is capturing complex interrelationships (e.g., Fukushima reactor testing wasn't high enough).

The pros and cons of ChatGPT. 13:54
Invalid use cases for synthetic data in more depth.
Examples where humans cannot anonymize effectively.
Creating new data for where the company is right now before diving into the use cases; i.e., differential privacy.

Mentally meaningful use cases for synthetic data. 16:38
Meaningful use cases for synthetic data, using the power of synthetic data correctly to generate outcomes that are important to you.
Pros and cons of using synthetic data in controlled environments.

The fallacy of "fairness through awareness". 18:39
Synthetic data is helpful for stress testing systems, edge case scenario thought experiments, simulation, stress testing system design, and scenario-based methodologies.
The recent push to use synthetic data.

Data augmentation and digital twin work. 21:26
Synthetic data as the only data is where the difficulties arise.
Data augmentation is a better use case for synthetic data.
Examples of digital twin methodology to create

What did you think? Let us know.

Good AI Needs Great Governance: Define, manage, and automate your AI model governance lifecycle from policy to proof.

Disclaimer: This post contains affiliate links. If you make a purchase, I may receive a commission at no extra cost to you.

Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:
LinkedIn - Episode summaries, shares of cited articles, and more.
YouTube - Was it something that we said? Good. Share your favorite quotes.
Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.

Jul 25, 2023 • 27min

Modeling with Christoph Molnar

Episode 4. The AI Fundamentalists welcome Christoph Molnar to discuss the characteristics of a modeling mindset in a rapidly innovating world. He is the author of multiple data science books including Modeling Mindsets, Interpretable Machine Learning, and his latest book, Introduction to Conformal Prediction with Python. We hope you enjoy this enlightening discussion from a model builder's point of view.

To keep in touch with Christoph's work, subscribe to his newsletter Mindful Modeler - "Better machine learning by thinking like a statistician. About model interpretation, paying attention to data, and always staying critical."

Summary

Introduction. 0:03
Introduction to the AI Fundamentalists podcast.
Welcome, Christoph Molnar.

What is machine learning? How do you look at it? 1:03
AI systems and machine learning systems.
Separating machine learning from classical statistical modeling.

What’s the best machine learning approach? 3:41
Confusion in the space between statistical learning and machine learning.
The importance of modeling mindsets.
Different approaches to using interpretability in machine learning.
Holistic AI in systems engineering.

Modeling is the most fun part but also the beginning. 8:19
Modeling is the most fun part of machine learning.
How to get lost in modeling.

How can we use the techniques in interpretable ML to create a system that we can explain to stakeholders that are non-technical? 10:36
How to interpret at the non-technical level.
Reproducibility is a big part of explainability.

Conformal prediction vs. interpretability tools. 12:51
Explainability to a data scientist vs. a regulator.
Interpretability is not a panacea.
Conformal prediction with Python.
Roadblocks to conformal prediction being used in the industry.

What’s the best technique for a job in data science? 17:20
The bandwagon effect of Netflix and machine learning.
The mindset difference between data science and other professions.

Machine learning is always catching up with the best practices in the industry. 19:21
The machine learning industry is catching up with best practices.
Synthetic data to fill in gaps.
The barrier to entry in machine learning.
How to learn from new models.

How to train your mindset before you start modeling. 23:52
The importance of simplifying two different mindsets.
Introduction to Conformal Prediction with Python.

Jun 27, 2023 • 37min

Why data matters | The right data for the right objective with AI

Episode 3. Get ready because we're bringing stats back! An AI model can only learn from the data it has seen. And business problems can’t be solved without the right data. The Fundamentalists break down the basics of data from collection to regulation to bias to quality in AI.

Introduction to this episode
Why data matters.

How do big tech's LLMs stack up to the proposed EU AI Act?
How major models such as OpenAI and Bard stack up against current regulations.
Stanford HAI - Do Foundation Model Providers Comply with the Draft EU AI Act?
Risk management documentation and risk management.
The EU is adding teeth outside of the banking and financial sectors now.
Time - Exclusive: OpenAI Lobbied the E.U. to Water Down AI Regulation

Bringing stats back: Why does data matter in all this madness?
How AI is taking us away from human intelligence.
Having quality data and bringing stats back!
The importance of having representative data and sampling data.

What are your business objectives? Don’t just throw data into it.
Understanding the use case of the data.
GDPR and EU AI regulations.
AI field caught off guard by new regulations.
Expectations for regulatory data.

What is data governance? How do you validate data?
Data management, data governance, and data quality.
Structured data collection for financial companies.

What else should we learn about our data collection and data processes?
Example: US Census data collection and data processes.
The importance of representativeness and being representative of the community in the census.
Step one, the fine curation of data, the intentional and knowledgeable creation of data that meets the specific business need.
Step two, fairness through awareness.
The importance of data curation and data selection in data quality.
What data quality looks like at a high level.
Rights to be forgotten.
The importance of data provenance and data governance in data science.
Synthetic data and privacy.

Data governance seems to be 40% of the path to AI model governance. What else needs to be in place?
What companies are missing with machine learning.
The impact that data will have on the future of AI.
The future of general AI.

May 31, 2023 • 31min

Truth-based AI: LLMs and knowledge graphs - back to basics

Truth-based AI: Large language models (LLMs) and knowledge graphs - The AI Fundamentalists, Episode 2

Show notes

What’s NOT new and what is new in the world of LLMs. 3:10
Getting back to the basics of modeling best practices and rigor.

What is AI and subsequently LLM regulation going to look like for tech organizations? 5:55
Recommendations for reading on the topic.
Andrew talks about regulation, monitoring, assurance, and alarm.

What does it mean to regulate generative AI models? 7:51
Concerns with regulating generative AI models.
Concerns about the call for regulation from OpenAI.

What is data privacy going to look like in the future? 10:16
Regulation of AI models and data privacy.
The NIST AI Risk Management Framework.
Making sure it's being used as a productivity tool.
How it's different from existing processes.

What’s different about these models vs old models? 15:07
Public perception of new machine learning models vs old models.
Hallucination in the field.

Does the use of chatbots change the tendency toward hallucinations? 17:27
Bing still suffers from the same problem with their LLMs.
Multi-objective modeling and multi-language modeling.

What does truth-based AI look like? 20:17
Public perception vs. modeling best practices.
Knowledge graphs vs. generative AI: ideal use cases for each.

Algorithms have a really interesting potential application which is a plugin library model. 23:00
Algorithms have an interesting potential application.
The benefits of a plugin library model.

What’s the future of large language models? 25:35
Practical uses for ML and knowledge bases.
Predictions on ML and ML-based databases.
Finding a way to make LLMs useful.
Next episodes of the podcast.

May 11, 2023 • 26min

Why AI Fundamentals? | AI rigor in engineering | Generative AI isn't new | Data quality matters in machine learning

The AI Fundamentalists - Ep1 Summary

Welcome to the first episode. 0:03
Welcome to the first episode of the AI Fundamentalists podcast.
Introducing the hosts.

Introducing Sid and Andrew. 1:23
Introducing Andrew Clark, co-founder and CTO of Monitaur.
Introduction of the podcast topic.

What is the proper rigorous process for using AI in manufacturing? 3:44
Large language models and AI.
Rigorous systems for manufacturing and innovation.

Predictive maintenance as an example of manufacturing. 6:28
Predictive maintenance in manufacturing.
The Apollo program.

The key things you can see when you’re new to running. 8:31
The importance of taking a step back.
Getting past the plateau in software engineering.

What’s the game changer in these generative models? 10:47
Can ChatGPT become a lawyer, doctor, or teacher?
The inflection point with generative models.

How can we put guardrails in place for these systems so they know when to not answer? 13:46
How to put guardrails in place for these systems.
The concept of multiple constraints.

Generative AI isn’t new, it’s embedded in our daily lives. 16:20
Generative AI is not a new technology.
Examples of generative AI.

The importance of data in machine learning. 19:01
The fundamental building blocks of machine learning.

AI is revolutionary, but it's been around for years. What can AI learn from systems engineering? 20:59
NASA Apollo program, systems engineering.
Systems engineering fundamentals: rigor, testing, and validating.
Understanding the why, data, and holistic systems management.
The AI curmudgeons, the AI fundamentalists.
