
Super Data Science: ML & AI Podcast with Jon Krohn
The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact.
Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy.
We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.
Latest episodes

Mar 21, 2023 • 1h 17min
663: Astonishing CICERO negotiates and builds trust with humans using natural language
NLP, transformer architectures, and machines beating humans at their own game: Jon Krohn talks to Alexander H. Miller about his work in building a machine that can outsmart humans in the game of Diplomacy by engineering powers of persuasion and collusion to its own advantage.
This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Training a natural language model to interact with Diplomacy players [05:07]
• Processing speeds for a Diplomacy bot [29:32]
• Using transformer architectures [37:25]
• How Diplomacy AI actually works [43:25]
• CICERO's potential real-world applications [55:28]
• How to R&D an AI project [59:27]
• How to become an AI Research Manager [1:06:12]
Additional materials: www.superdatascience.com/663

Mar 17, 2023 • 8min
662: The Most Popular SuperDataScience Podcast Episodes of 2022
Our list of the top 10 SuperDataScience podcast episodes for 2022 is here. From Pandas to causality, AI breakthroughs and data storytelling, these were your most popular episodes of the year gone by.
Additional materials: www.superdatascience.com/662
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.

Mar 14, 2023 • 1h 17min
661: Designing Machine Learning Systems
Chip Huyen, co-founder of Claypot AI and author of O'Reilly's best-selling "Designing Machine Learning Systems", is here to share her expertise on designing production-ready machine learning applications, the importance of iteration in real-world deployment, and the critical role of real-time machine learning in various applications. Technical listeners like data scientists and machine learning engineers will definitely enjoy this one!
This episode is brought to you by Pathway, the reactive data processing framework (pathway.com), and by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Why Chip wrote 'Designing Machine Learning Systems' [08:58]
• How Chip ended up teaching at Stanford [13:18]
• About Chip's book 'Designing Machine Learning Systems' [21:12]
• What makes ML feel like magic [30:53]
• How to align business intent, context, and metrics with ML [37:55]
• The lessons Chip learned about training data [42:03]
• Chip's secrets to engineering good features [53:19]
• How Chip optimizes her productivity [1:07:48]
Additional materials: www.superdatascience.com/661

Mar 10, 2023 • 4min
660: Five Ways to Use ChatGPT for Data Science
ChatGPT is well known for its potential to disrupt the writing industry, but in what other, perhaps less explored, ways can we use the tool? In this episode, Jon Krohn outlines five critical ways that ChatGPT can augment a data scientist’s work. From generating code to acting as a translation tool for programming languages, listen in to hear why ChatGPT could become a vital part of every data scientist’s toolkit.
Additional materials: www.superdatascience.com/660
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.

Mar 7, 2023 • 1h 21min
659: Open-Source Tools for Natural Language Processing
NLP practitioners: this episode is for you. From the awareness of linguistic elements and annotation to getting the necessary people in the room, Vincent Warmerdam presents to Jon Krohn a recipe for a successful project and the open-source NLP tools to get there.
This episode is brought to you by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• How Vincent came to work with De Speld [08:57]
• Vincent’s role at Explosion [18:59]
• How users can apply spaCy [21:46]
• Prodigy: Annotate training data more efficiently with scripts [26:28]
• How to manage “skill anxiety” with Calmcode [32:32]
• How Vincent fixed bad labels [42:47]
• The value of understanding linguistics for NLP [54:42]
• How to constrain artificial stupidity [1:02:38]
Additional materials: www.superdatascience.com/659

Mar 3, 2023 • 36min
658: How to Build Data and ML Products Users Love
What makes data products popular? Brian T. O'Neill, Founder and Principal of Designing for Analytics, returns to the podcast to help us crack the code on building data products that people love.
Additional materials: www.superdatascience.com/658
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.

Feb 28, 2023 • 1h 10min
657: How to Learn Data Engineering
Data engineering educator Andreas Kretz joins Jon Krohn for a 1-hour primer that covers everything you need to know about the most in-demand role in data. From skills to tools, problem-solving processes and more, growing your knowledge of data engineering only improves your marketability, so tune in today if you're ready to future-proof your data career.
This episode is brought to you by Glean (glean.io), the platform for data insights fast, and by epic LinkedIn Learning instructor Keith McCormick (linkedin.com/learning/instructors/keith-mccormick). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• Why learn data engineering? [06:55]
• What is data engineering? [08:08]
• What sets Senior Data Engineers apart from junior ones? [13:57]
• The must-know data-engineering tools [20:26]
• The right path to learn data engineering [44:24]
• Are certifications worth it? [51:46]
• The future of data engineering [55:24]
• Andreas's career challenges [58:48]
Additional materials: www.superdatascience.com/657

Feb 24, 2023 • 42min
656: A.I. Talent and the Red-Hot A.I. Skills
How to attract an AI recruiter’s attention: In this episode, Jon Krohn and Tribe AI CEO Jaclyn Rice Nelson break down the key ingredients needed to make a Tribe AI recruiter say “yes!” Get Jaclyn’s top tips for forward-thinking AI talent, the skills you need to learn, and the in-demand roles on Tribe’s list of clients.
Additional materials: www.superdatascience.com/656
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.

Feb 21, 2023 • 1h 43min
655: AI ROI: How to get a profitable return on an AI-project investment
Transparent data science, profitable AI, and what’s missing from a data science education: Pandata’s Data Scientist in Residence Keith McCormick and Jon Krohn discuss how “insights” can never be the end product of a data science project, how to ensure you have a specific goal at the start of a project that is related to revenue, and why there is so much miscommunication between data scientists and their clients. Exclude the C-suite at your peril!
This episode is brought to you by Glean (glean.io), the platform for data insights, fast. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What an Executive Data Scientist in Residence is [05:27]
• What A.I. transparency is and how it relates to the field of Explainable A.I. (XAI) [17:34]
• How companies can ensure they profit from AI projects [36:47]
• Possible organization structures for data science teams to be profitable [1:02:41]
• The current gaps in data science education [1:09:58]
Additional materials: www.superdatascience.com/655

Feb 17, 2023 • 45min
654: Mike Wimmer: The 14-Year-Old A.I. Entrepreneur
14-year-old AI prodigy Mike Wimmer joins Jon Krohn to discuss his latest projects. Whether he's using AI to help conserve the world's coral reefs or launching his new IoT-based company, Mike is an endless source of inspiration in the field of AI.
Additional materials: www.superdatascience.com/654
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.