

The AI Fundamentalists
Dr. Andrew Clark & Sid Mangalik
A podcast about the fundamentals of safe and resilient modeling systems behind the AI that impacts our lives and our businesses.
Episodes

Sep 20, 2023 • 24min
Digital twins in AI systems
Episode 7. To use or not to use? That is the question about digital twins that the fundamentalists explore. Many solutions continue to be proposed for making AI systems safer, but can digital twins really deliver for AI what we know they can do for physical systems? Tune in and find out.

Show notes

Digital twins by definition. 0:03
- Digital twins are one-to-one digital models of real-life products, systems, or processes, used for simulations, testing, monitoring, maintenance, or practice decommissioning.
- The digital twin should be indistinguishable from the physical twin, allowing for safe and efficient problem-solving in a computerized environment.

Digital twins in manufacturing and aerospace engineering. 2:22
- Digital twins are virtual replicas of physical processes, useful in manufacturing and space, but often misunderstood as just simulations or models.
- Sid highlights the importance of identifying digital twin trends and distinguishing them from simulations or sandbox environments.
- Andrew emphasizes the need for data standards and ETL processes to handle different vendors and data forms, clarifying that digital twins are not a one-size-fits-all solution.

Digital twins, AI models, and validation in a hybrid environment. 6:51
- Validation is crucial for deploying mission-critical AI models, including generative AI.
- Sid clarifies the misconception that AI models can directly replicate physical systems, emphasizing the importance of modeling specific data and context.
- Andrew and Susan discuss the confusion around modeling and its limitations, including the need to validate models on specific datasets and avoid generalizing across contexts.
- Referenced article from VentureBeat: 10 digital twin trends for 2023

Digital twins, IoT, and their applications. 11:05
- Susan and Sid discuss the limitations of digital twins, including their inability to interact with the real world and the complexity of modeling systems.
- They reference a 2012 NASA paper that popularized the term "digital twin" and highlight the potential for confusion in its application to various industries.
- Sid: Digital twinning requires more than just IoT devices; it's a complex process that involves monitoring and IoT devices across the physical system to create a perfect digital twin.
- Andrew: Digital twins raise security and privacy concerns, especially in healthcare, where there are lots of IoT devices and personal data that need to be protected.

Data privacy and security in digital twin technology. 17:03
- Digital twins and data privacy face off in the IoT debate.
- Susan and Andrew discuss data privacy concerns with AI and IoT, highlighting the potential for data breaches and lack of transparency.

Digital twins in healthcare and technology. 20:16
- Susan and Andrew discuss digital twins in various industries, emphasizing their imp…

What did you think? Let us know.

Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:
- LinkedIn - Episode summaries, shares of cited articles, and more.
- YouTube - Was it something that we said? Good. Share your favorite quotes.
- Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
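The one-to-one model, sensor sync, and fault-detection ideas in these notes can be sketched in a few lines of Python. The toy cooling-tank process, class names, and tolerance below are purely illustrative, not from the episode:

```python
class PhysicalTank:
    """Stands in for the real asset; in practice readings come from IoT sensors."""
    def __init__(self, temp=90.0):
        self.temp = temp

    def step(self):
        self.temp *= 0.95           # the real tank cools about 5% per tick
        return self.temp            # the sensor reading we receive

class DigitalTwin:
    """One-to-one digital model of the tank, usable for safe what-if testing."""
    def __init__(self, temp=90.0, cooling=0.95):
        self.temp = temp
        self.cooling = cooling

    def step(self):
        self.temp *= self.cooling   # the model's prediction of the next state

    def sync(self, sensor_reading, tolerance=1.0):
        """Compare the prediction to reality; a large gap flags drift or a fault."""
        drift = abs(self.temp - sensor_reading)
        self.temp = sensor_reading  # re-anchor the twin to the measured state
        return drift < tolerance

tank, twin = PhysicalTank(), DigitalTwin()
for _ in range(10):
    reading = tank.step()
    twin.step()
    assert twin.sync(reading)       # the twin tracks the physical system
```

The divergence check is the point: a well-calibrated twin mirrors the asset, and a growing gap between prediction and sensor data is exactly the fault signal monitoring should surface.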

Aug 23, 2023 • 28min
Fundamentals of systems engineering
Episode 6. What does systems engineering have to do with AI fundamentals? In this episode, the team discusses what data and computer science as professions can learn from systems engineering, and how the methods and mindset of the latter can boost the quality of AI-based innovations.

Show notes

News and episode commentary. 0:03
- ChatGPT usage is down for the second straight month.
- The importance of understanding the data and how it affects the quality of synthetic data for non-tabular use cases like text. (Episode 5, Synthetic data)
- Business decisions: the 2012 case of Target using algorithms in their advertising. (CIO, June 2023)

Systems engineering thinking. 3:45
- The difference between building algorithms, building models, and building systems.
- The term systems engineering came from Bell Labs in the 1940s and came into its own with the NASA Apollo program.
- A system is a way of looking at the world: there is emergent behavior, and complex interactions and relationships between data.
- AI systems and ML systems are often distant from the expertise of people who do systems engineering.

Learning the hard way. 9:25
- Systems engineering is about doing things the hard way: learning the physical sciences, math, and how things work.
- What else can be learned from the Apollo program.
- Developing a system, and how important it is to align with the criticality and safety requirements of the project.
- Systems engineering is often associated, incorrectly, with waterfall in software engineering.

What is a safer model to build? 14:26
- What is a safer model, and how is systems engineering going to fit in with this world?
- The data science hacker culture can be counterintuitive to this approach. For example, actuaries have a professional code of ethics and a set way that they learn.

Step back and review your model. 18:26
- Peer review your model: see if reviewers can break it and stress-test it.
- Build monitoring around knowing where the fault points are, and also talk to business leaders.
- Be careful about the other impacts the model can have on the business, or externally on the people who start using it.
- Market this type of engineering as robustness of the model: identifying what it is good at and what it is bad at can itself be a selling point.
- Systems thinking gives a chance to create lasting models.
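The stress-testing and fault-point advice above can be sketched as a minimal guardrail around a model: probe it at and beyond the boundaries it was validated on, then encode those boundaries as monitoring. The toy model and validated range are assumptions for illustration:

```python
def model(temperature_c):
    """Toy predictor, validated only on temperatures between -10 and 40 C."""
    return 0.5 * temperature_c + 12.0

TRAINED_RANGE = (-10.0, 40.0)

def guarded_predict(temperature_c):
    """Monitoring wrapper: refuse inputs outside the validated envelope."""
    lo, hi = TRAINED_RANGE
    if not lo <= temperature_c <= hi:
        raise ValueError(f"input {temperature_c} is outside validated range {TRAINED_RANGE}")
    return model(temperature_c)

# Stress test: edge cases at and beyond the boundary of the validated range.
assert guarded_predict(40.0) == 32.0   # boundary case still answered
try:
    guarded_predict(120.0)             # a fault point found during stress testing
except ValueError:
    pass                               # the system knows when not to answer
```

Knowing where the model breaks, and refusing to answer there, is the "robustness as a selling point" framing from the episode in its smallest possible form.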

Aug 8, 2023 • 32min
Synthetic Data in AI
Episode 5. This episode about synthetic data is very real. The fundamentalists uncover the pros and cons of synthetic data, as well as reliable use cases and the best techniques for safe and effective use in AI. When even SAG-AFTRA and OpenAI make synthetic data a household word, you know this is an episode you can't miss.

Show notes

What is synthetic data? 0:03
- The definition is not a succinct one-liner, which is one of the key issues with assessing synthetic data generation.
- Using general information scraped from the web for ML is backfiring.

Synthetic data generation and data recycling. 3:48
- OpenAI is running against the problem that they don't have enough data at the scale at which they're trying to operate.
- The poisoning effect that happens when training on your own generated data.
- Synthetic data generation is not a panacea, and it is not an exact science; it's more of an art than a science.

The pros and cons of using synthetic data. 6:46
- The pros and cons of using synthetic data to train AI models, and how it differs from traditional medical data.
- The importance of diversity in the training of AI models.
- Synthetic data is a nuanced field, taking away the complexity of building data that is representative of a solution.

Differences between randomized and synthetic data. 9:52
- Differential privacy is a lot more difficult to execute than a lot of people are saying.
- Anonymization is a huge piece of the application for fairness and bias, especially with larger deployments.
- The hardest part is capturing complex interrelationships. (e.g., Fukushima reactor testing wasn't high enough)

The pros and cons of ChatGPT. 13:54
- Invalid use cases for synthetic data in more depth.
- Examples where humans cannot anonymize effectively.
- Creating new data for where the company is right now before diving into the use cases; i.e., differential privacy.

Meaningful use cases for synthetic data. 16:38
- Meaningful use cases for synthetic data: using the power of synthetic data correctly to generate outcomes that are important to you.
- Pros and cons of using synthetic data in controlled environments.

The fallacy of "fairness through awareness". 18:39
- Synthetic data is helpful for stress testing systems, edge-case scenario thought experiments, simulation, stress testing system design, and scenario-based methodologies.
- The recent push to use synthetic data.

Data augmentation and digital twin work. 21:26
- Synthetic data as the only data is where the difficulties arise.
- Data augmentation is a better use case for synthetic data.
- Examples of digital twin methodology to create…
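The "hardest part is capturing complex interrelationships" point can be shown with a minimal Python sketch: a naive synthetic generator that matches each column's mean and spread independently still destroys the correlation between columns. The height/weight data and thresholds below are invented for illustration:

```python
import random
from statistics import mean, stdev

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

random.seed(0)

# "Real" data: height and weight are strongly correlated.
heights = [150 + 30 * random.random() for _ in range(2000)]
weights = [0.9 * h - 60 + random.gauss(0, 3) for h in heights]

# Naive generator: sample each column independently from its own distribution.
synth_h = [random.gauss(mean(heights), stdev(heights)) for _ in range(2000)]
synth_w = [random.gauss(mean(weights), stdev(weights)) for _ in range(2000)]

print(round(pearson(heights, weights), 2))   # strong: close to 1
print(round(pearson(synth_h, synth_w), 2))   # near 0: the relationship is lost
```

Each synthetic column looks fine in isolation; the joint structure a downstream model would learn from is gone, which is why generation quality has to be assessed on relationships, not just marginals.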

Jul 25, 2023 • 27min
Modeling with Christoph Molnar
Episode 4. The AI Fundamentalists welcome Christoph Molnar to discuss the characteristics of a modeling mindset in a rapidly innovating world. He is the author of multiple data science books including Modeling Mindsets, Interpretable Machine Learning, and his latest book Introduction to Conformal Prediction with Python. We hope you enjoy this enlightening discussion from a model builder's point of view.

To keep in touch with Christoph's work, subscribe to his newsletter Mindful Modeler - "Better machine learning by thinking like a statistician. About model interpretation, paying attention to data, and always staying critical."

Summary

Introduction. 0:03
- Introduction to the AI Fundamentalists podcast.
- Welcome, Christoph Molnar.

What is machine learning? How do you look at it? 1:03
- AI systems and machine learning systems.
- Separating machine learning from classical statistical modeling.

What's the best machine learning approach? 3:41
- Confusion in the space between statistical learning and machine learning.
- The importance of modeling mindsets.
- Different approaches to using interpretability in machine learning.
- Holistic AI in systems engineering.

Modeling is the most fun part, but also just the beginning. 8:19
- Modeling is the most fun part of machine learning.
- How to get lost in modeling.

How can we use the techniques in interpretable ML to create a system that we can explain to non-technical stakeholders? 10:36
- How to interpret at the non-technical level.
- Reproducibility is a big part of explainability.

Conformal prediction vs. interpretability tools. 12:51
- Explainability to a data scientist vs. a regulator.
- Interpretability is not a panacea.
- Conformal prediction with Python.
- Roadblocks to conformal prediction being used in the industry.

What's the best technique for a job in data science? 17:20
- The bandwagon effect of Netflix and machine learning.
- The mindset difference between data science and other professions.

Machine learning is always catching up with the best practices in the industry. 19:21
- The machine learning industry is catching up with best practices.
- Synthetic data to fill in gaps.
- The barrier to entry in machine learning.
- How to learn from new models.

How to train your mindset before you start modeling. 23:52
- The importance of simplifying two different mindsets.
- Introduction to Conformal Prediction with Python.
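Split conformal prediction, the technique behind Christoph's book, fits in a few lines of plain Python: score a held-out calibration set, take a quantile of the scores, and wrap any point prediction in an interval of that width. The toy data and point model below are stand-ins, not examples from the book:

```python
import random

random.seed(1)

def predict(x):
    return 2.0 * x              # stand-in for any fitted point model

# Toy data: y = 2x plus unit Gaussian noise.
data = []
for _ in range(1000):
    x = random.uniform(0, 10)
    data.append((x, 2.0 * x + random.gauss(0, 1)))
calib, holdout = data[:500], data[500:]

# Nonconformity scores on held-out calibration data: absolute residuals.
scores = sorted(abs(y - predict(x)) for x, y in calib)

# For 90% coverage, take the ceil((n + 1) * 0.9)-th smallest score.
n = len(scores)
q = scores[min(n - 1, int(0.9 * (n + 1)))]

# Intervals [prediction - q, prediction + q] then cover ~90% of unseen points.
covered = sum(predict(x) - q <= y <= predict(x) + q for x, y in holdout)
print(covered / len(holdout))   # close to 0.9
```

The coverage guarantee holds regardless of what the underlying model is, which is why conformal prediction pairs naturally with the episode's theme of staying critical about any single modeling mindset.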

Jun 27, 2023 • 37min
Why data matters | The right data for the right objective with AI
Episode 3. Get ready because we're bringing stats back! An AI model can only learn from the data it has seen. And business problems can't be solved without the right data. The Fundamentalists break down the basics of data in AI, from collection to regulation to bias to quality.

Introduction to this episode
- Why data matters.

How do big tech's LLM models stack up to the proposed EU AI Act?
- How major models such as OpenAI's and Bard stack up against current regulations.
- Stanford HAI - Do Foundation Model Providers Comply with the Draft EU AI Act?
- Risk management documentation and risk management.
- The EU is adding teeth outside of the banking and financial sectors now.
- Time - Exclusive: OpenAI Lobbied the E.U. to Water Down AI Regulation

Bringing stats back: Why does data matter in all this madness?
- How AI is taking us away from human intelligence.
- Having quality data and bringing stats back!
- The importance of having representative data and sampling data.
- What are your business objectives? Don't just throw data at the problem.
- Understanding the use case of the data.

GDPR and EU AI regulations
- The AI field was caught off guard by new regulations.
- Expectations for regulatory data.

What is data governance? How do you validate data?
- Data management, data governance, and data quality.
- Structured data collection for financial companies.

What else should we learn about our data collection and data processes?
- Example: US Census data collection and data processes.
- The importance of representativeness, and being representative of the community in the census.
- Step one: the fine curation of data, the intentional and knowledgeable creation of data that meets the specific business need.
- Step two: fairness through awareness.
- The importance of data curation and data selection in data quality.
- What data quality looks like at a high level.
- Rights to be forgotten.
- The importance of data provenance and data governance in data science.
- Synthetic data and privacy.

Data governance seems to be 40% of the path to AI model governance. What else needs to be in place?
- What companies are missing with machine learning.
- The impact that data will have on the future of AI.
- The future of general AI.
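The "fine curation" and representativeness points above can be sketched as a pre-modeling validation step: check every record against basic quality constraints, then compare sample proportions against a reference population such as census figures. The field names, reference distribution, and toy records are illustrative assumptions:

```python
from collections import Counter

records = [
    {"age": 34, "region": "north"},
    {"age": 51, "region": "south"},
    {"age": 29, "region": "north"},
    {"age": 43, "region": "north"},
]

# Quality: every record satisfies basic type and range constraints.
assert all(isinstance(r["age"], int) and 0 < r["age"] < 120 for r in records)

# Representativeness: compare sample proportions to a reference distribution.
reference = {"north": 0.5, "south": 0.5}   # e.g. taken from census data
counts = Counter(r["region"] for r in records)
sample = {k: counts[k] / len(records) for k in reference}

gaps = {k: abs(sample[k] - reference[k]) for k in reference}
print(gaps)   # north is over-represented: {'north': 0.25, 'south': 0.25}
```

A gap like this is the signal to fix collection or reweight before modeling; no amount of downstream tuning repairs a sample that does not represent the population the business question is about.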

May 31, 2023 • 31min
Truth-based AI: LLMs and knowledge graphs - back to basics
Truth-based AI: Large language models (LLMs) and knowledge graphs - The AI Fundamentalists, Episode 2

Show Notes

What's NOT new and what is new in the world of LLMs. 3:10
- Getting back to the basics of modeling best practices and rigor.

What is AI, and subsequently LLM, regulation going to look like for tech organizations? 5:55
- Recommendations for reading on the topic.
- Andrew talks about regulation, monitoring, assurance, and alarms.

What does it mean to regulate generative AI models? 7:51
- Concerns with regulating generative AI models.
- Concerns about the call for regulation from OpenAI.

What is data privacy going to look like in the future? 10:16
- Regulation of AI models and data privacy.
- The NIST AI Risk Management Framework.
- Making sure it's being used as a productivity tool.
- How it's different from existing processes.

What's different about these models vs. old models? 15:07
- Public perception of new machine learning models vs. old models.
- Hallucination in the field.

Does the use of chatbots change the tendency toward hallucinations? 17:27
- Bing still suffers from the same problem with their LLMs.
- Multi-objective modeling and multi-language modeling.

What does truth-based AI look like? 20:17
- Public perception vs. modeling best practices.
- Knowledge graphs vs. generative AI: ideal use cases for each.

Algorithms have a really interesting potential application: a plugin library model. 23:00
- Algorithms have an interesting potential application.
- The benefits of a plugin library model.

What's the future of large language models? 25:35
- Practical uses for ML and knowledge databases.
- Predictions on ML and ML-based databases.
- Finding a way to make LLMs useful.
- Next episodes of the podcast.
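The knowledge-graph side of the truth-based AI discussion can be sketched as a tiny triple store: facts live as explicit subject-predicate-object triples, so an answer is either grounded in a stored fact or explicitly unknown. The triples and query helper below are illustrative, not a production knowledge-graph API:

```python
# Facts as subject-predicate-object triples, the core knowledge-graph idea.
triples = {
    ("Apollo 11", "landed_on", "Moon"),
    ("Apollo 11", "launched_in", "1969"),
    ("Saturn V", "launched", "Apollo 11"),
}

def query(subject, predicate):
    """Return the stored object for (subject, predicate), or None if unknown."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None   # unlike a generative model, it never invents an answer

print(query("Apollo 11", "landed_on"))   # Moon
print(query("Apollo 11", "crewed_by"))   # None: that fact is not in the graph
```

That explicit "None" is the contrast with hallucination: a knowledge graph can say it doesn't know, which is what makes it the truth-grounded complement to an LLM in the episode's plugin-library framing.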

May 11, 2023 • 26min
Why AI Fundamentals? | AI rigor in engineering | Generative AI isn't new | Data quality matters in machine learning
The AI Fundamentalists - Ep. 1 Summary

Welcome to the first episode. 0:03
- Welcome to the first episode of the AI Fundamentalists podcast.
- Introducing the hosts.

Introducing Sid and Andrew. 1:23
- Introducing Andrew Clark, co-founder and CTO of Monitaur.
- Introduction of the podcast topic.

What is the proper rigorous process for using AI in manufacturing? 3:44
- Large language models and AI.
- Rigorous systems for manufacturing and innovation.

Predictive maintenance as an example in manufacturing. 6:28
- Predictive maintenance in manufacturing.
- The Apollo program.

The key things you can see when you're new to running. 8:31
- The importance of taking a step back.
- Getting past the plateau in software engineering.

What's the game changer in these generative models? 10:47
- Can ChatGPT become a lawyer, doctor, or teacher?
- The inflection point with generative models.

How can we put guardrails in place for these systems so they know when not to answer? 13:46
- How to put guardrails in place for these systems.
- The concept of multiple constraints.

Generative AI isn't new; it's embedded in our daily lives. 16:20
- Generative AI is not a new technology.
- Examples of generative AI.

The importance of data in machine learning. 19:01
- The fundamental building blocks of machine learning.
- AI is revolutionary, but it's been around for years.

What can AI learn from systems engineering? 20:59
- The NASA Apollo program and systems engineering.
- Systems engineering fundamentals: rigor, testing, and validating.
- Understanding the why, the data, and holistic systems management.
- The AI curmudgeons, the AI fundamentalists.