Super Data Science: ML & AI Podcast with Jon Krohn

Jon Krohn
Oct 11, 2024 • 42min

826: In Case You Missed It in September 2024

Next-gen IDEs, efficiency-boosting open-source Python libraries, and changes in hiring for data scientists: This episode of In Case You Missed It gives you our best clips of September’s interviews, hosted by Jon Krohn.

Additional materials: www.superdatascience.com/826

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Oct 8, 2024 • 1h 2min

825: Data Contracts: The Key to Data Quality, with Chad Sanderson

Data contracts are redefining data quality and governance, and Chad Sanderson, CEO of Gable.ai, joins host Jon Krohn to explain how they can transform your data strategy. He breaks down what data contracts are, how they shift data quality checks closer to production, and why they’re essential for reducing data debt. Chad also highlights how better alignment between data producers and consumers can elevate data reliability and tackle change-management challenges in modern organizations.

This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
• What data contracts are and how they define expectations for data quality [03:16]
• What data contracts look like [09:09]
• The common misconceptions about data quality when implementing AI [12:55]
• Chad’s Chief Operator role at Data Quality Camp [19:46]
• How “shifting left” improves data reliability by addressing issues early [24:17]
• Why data professionals still struggle with data quality [30:31]
• How data debt forms and why it leads to complex, inefficient architectures [35:53]
• How will the role of human oversight evolve in ensuring data quality? [47:12]
• How can data teams leverage storytelling? [52:33]

Additional materials: www.superdatascience.com/825

Oct 4, 2024 • 14min

824: Llama 3.2: Open-Source Edge and Multimodal LLMs

Llama 3.2 brings a new era of AI innovation with lightweight models tailored for on-device applications and powerful vision models for handling complex image inputs. Host Jon Krohn explores how this release pushes the boundaries of open-source AI, making it more accessible and versatile for developers. He also covers the Llama Stack toolkit, designed to streamline deployment, and Llama Guard 3, Meta’s latest content moderation solution. With extensive support from major cloud and hardware partners, Llama 3.2 is set to unlock groundbreaking possibilities for AI across mobile and beyond. Tune in to hear more.

Additional materials: www.superdatascience.com/824

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Oct 1, 2024 • 1h 21min

823: Virtual Humans and AI Clones, with Natalie Monbiot

Virtual humans are rewriting the rules of digital communication and reshaping entire industries. This week, Jon Krohn welcomes Natalie Monbiot, Head of Strategy at Hour One, to shed light on how AI avatars are revolutionizing L&D and e-commerce by turning traditional training and product listings into captivating, presenter-led content.

This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick, by Gurobi, the Decision Intelligence Leader, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
• How do you create a virtual being? [10:55]
• Reid Hoffman's avatar [13:40]
• The virtual human economy [31:07]
• Virtual human societies [51:24]
• Virtual humans and creative expression [56:35]
• Challenges in maintaining transparency [01:00:22]

Additional materials: www.superdatascience.com/823

Sep 27, 2024 • 19min

822: NotebookLM: Jaw-Dropping Podcast Episodes Generated About Your Documents

NotebookLM, Google’s latest AI tool, takes content creation to a new level. This week, Jon Krohn shares how the platform transformed his 200-page dissertation into a fascinating 11-minute podcast. Discover how AI can turn vast amounts of information into engaging and digestible content, opening up new possibilities for content creation.

Additional materials: www.superdatascience.com/822

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Sep 24, 2024 • 1h 13min

821: The Skills You Need to Be an Effective Data Scientist, with Marck Vaisman

Marck Vaisman speaks to Jon Krohn about his paradigm for understanding core data practitioner types. Hear Marck detail the four data practitioner personas that he has identified in his research, why he believes the roadmaps that influencers like to promote as surefire ways to a data science career don’t work in practice, and why the term “data scientist” is still so elusive and hard to recruit for.

This episode is brought to you by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
• How Marck started his work in defining data science roles [08:06]
• The relationship between the four data practitioner personas [15:26]
• About Marck’s “menu” for effective data science [40:43]
• How recruiters can hire the best data scientist for the job [59:31]

Additional materials: www.superdatascience.com/821

Sep 20, 2024 • 27min

820: OpenAI's o1 "Strawberry" Models

Jon Krohn takes OpenAI’s new models (o1-preview and o1-mini) for a spin in this Five-Minute Friday, learning their key strengths and limitations, and how the o1 series may represent yet another landmark for generative AI.

Additional materials: www.superdatascience.com/820

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Sep 17, 2024 • 1h 6min

819: PyTorch: From Zero to Hero, with Luka Anicin

SuperDataScience veteran and Udemy teacher Luka Anicin is on the podcast to talk about his brand-new course, “PyTorch: From Zero to Hero”, available exclusively on superdatascience.com. Host Jon Krohn asks Luka why he feels that every data scientist should consider PyTorch as their default Python library, and why “keeping it simple” can secure the success of a machine learning project.

This episode is brought to you by AWS Inferentia and AWS Trainium, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
• About the PyTorch library [03:29]
• Why PyTorch became so popular [25:24]
• How to increase accuracy and efficiency in PyTorch [31:49]
• How to utilize transfer learning [35:44]
• Why real-world projects are essential to data scientists [41:10]
• About Datablooz [46:49]

Additional materials: www.superdatascience.com/819

Sep 13, 2024 • 30min

818: In Case You Missed It in August 2024

Experts from AI and data science discuss the impact and benefits of decentralization, the importance of structuring AI systems in business, and why knowing the basics will always matter for data engineers. Listen to Shingai Manjengwa (episode 809), Daniel Hulme (episode 807), Jerry Yurchisin (episode 813) and Nick Elprin (episode 811) explore a future world of work that rewards continuing learners, sets tasks for the people best suited to complete them rather than those whose job titles reflect the spec, and applies a fleet of ‘AI agents’ to solve complex business tasks.

Additional materials: www.superdatascience.com/818

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Sep 10, 2024 • 1h 36min

817: The Positron IDE, Tidy NLP and MLOps with Dr. Julia Silge

Dr. Julia Silge, Engineering Manager at Posit, introduces the brand-new Positron IDE, perfect for exploratory data analysis and visualization. She also lays out her top picks for LLMs that boost coding efficiency and discusses when traditional NLP methods might be the smarter choice over LLMs. Plus, Julia highlights some must-know open-source libraries that make managing MLOps easier than ever. Tune in for insights that every data scientist, ML engineer, and developer will find useful.

This episode is brought to you by Gurobi, the Decision Intelligence Leader, and by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:
• Overview of Posit and Positron IDE [05:20]
• How the needs of a data scientist differ from those of a software developer [10:54]
• How to contribute to the open-source Positron [19:50]
• MLOps and Vetiver: Tools for deploying and maintaining ML models [37:01]
• Natural Language Processing (NLP) and the Tidyverse approach [50:34]
• The role of AI and LLMs in data science education [1:24:18]

Additional materials: www.superdatascience.com/817
