

Super Data Science: ML & AI Podcast with Jon Krohn
Jon Krohn
The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact.
Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy.
We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.

Feb 25, 2025 • 54min
865: How to Grow (and Sell) a Data Science Consultancy, with Cal Al-Dhubaib
Jon Krohn talks to Cal Al-Dhubaib about the extraordinary success of AI and machine learning solutions provider Pandata, his ironclad hack for any company to define their core values, and how to attract and secure loyal clients. Cal thinks tech professionals make two critical mistakes in their careers. The first is that they too often enjoy being the gatekeepers of their work rather than educating their clients and coworkers about the details of their projects and why those projects benefit the company. The second is that tech professionals don’t show vulnerability, whether that means admitting they don’t know a topic or that they don’t fully understand how a business works. This issue, Cal says, can spell the difference between a startup’s success and failure. Learn how tech startups can make an ironclad strategy for their future in this episode of The SuperDataScience Podcast.
This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
(09:32) How to scale a successful data science consultancy
(22:25) How Pandata navigates highly regulated environments
(27:59) How to tackle tech illiteracy in business
(36:32) What skills Cal looks for in new hires
(35:56) How to sell a tech company
Additional materials: www.superdatascience.com/865

Feb 21, 2025 • 8min
864: OpenAI’s o3-mini: SOTA reasoning and exponentially cheaper
Jon Krohn investigates OpenAI’s new release, o3-mini, in this Five-Minute Friday, where he walks through the reasoning model’s capabilities and performance, comparing them against other major-league players: DeepSeek-R1, GPT-4o, and Claude 3.5 Sonnet.
Additional materials: www.superdatascience.com/864
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Feb 18, 2025 • 1h 6min
863: TabPFN: Deep Learning for Tabular Data (That Actually Works!), with Prof. Frank Hutter
Jon Krohn talks tabular data with Frank Hutter, Professor of Artificial Intelligence at Universität Freiburg in Germany. Despite the great steps that deep learning has made in analysing images, audio, and natural language, tabular data has remained a stubborn obstacle for it. In this episode, Frank Hutter details the path he has found around this obstacle, even with limited data, by using a ground-breaking transformer architecture. Named TabPFN, this approach is vastly outperforming other architectures, as attested by a write-up of TabPFN’s capabilities in Nature. Frank talks about his work on version 2 of TabPFN, the architecture’s cross-industry applicability, and how TabPFN is able to return accurate results with synthetic data.
This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
(05:57) All about the TabPFN architecture
(21:27) Use cases for Bayesian inference
(35:07) On getting published in Nature
(44:03) How TabPFN handles time series data
(51:52) All about Prior Labs
Additional materials: www.superdatascience.com/863

Feb 14, 2025 • 32min
862: In Case You Missed It in January 2025
In this episode of “In Case You Missed It”, Jon Krohn shares his favorite clips from the last four weeks. He talks to Azeem Azhar, Florian Neukart, Kirill Eremenko, Hadelin de Ponteves, and Brooke Hopkins on what’s in store for AI in 2025, from quantum computing and customizable tools to handy checklists and how the mathematics of exponentials can help us keep our heads about the swift advancement of AI.
Additional materials: www.superdatascience.com/862
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Feb 11, 2025 • 2h 1min
861: From Pro Athlete to Data Engineer: Colleen Fotsch’s Inspiring Journey
How does a CrossFit winner, bobsledder and swimmer go on to have a glittering career in data analytics and engineering? Colleen Fotsch talks to Jon Krohn about transitioning into very different career paths, how sports gave her the competitive mindset she needed for success in data science, and seeing the niche role of analytics engineering as a bridge between data engineering and analysis.
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
(05:49) Colleen’s path from athlete to data analyst
(1:14:41) About the data build tool (dbt)
(1:22:51) Colleen’s work at CHG Healthcare
(1:32:45) How Colleen and Tia-Clair got started with PRVN GO
Additional materials: www.superdatascience.com/861

Feb 7, 2025 • 13min
860: DeepSeek R1: SOTA Reasoning at 1% of the Cost
DeepSeek-curious? This Five-Minute Friday is for you! Jon Krohn investigates the overwhelming overnight success of this new LLM, the product of a Chinese hedge fund. DeepSeek is a market newcomer, and yet it runs shoulder to shoulder with behemoths from OpenAI, Anthropic and Google like it’s all in a day’s work.
Additional materials: www.superdatascience.com/860
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Feb 4, 2025 • 60min
859: BAML: The Programming Language for AI, with Vaibhav Gupta
In this week’s guest interview, Vaibhav Gupta talks to Jon Krohn about creating a programming language, BAML, that helps companies save up to 30% on their AI costs. He explains how he started tailoring BAML to facilitate natural language generation interactions with AI models and how BAML helps companies optimize their outputs, and he also lets listeners in on Boundary’s hiring process.
This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
(04:53) What BAML stands for
(14:33) Making prompt engineering a serious practice
(18:00) How BAML helps companies
(23:30) Using retrieval-augmented generation (RAG)
(43:09) How to get a job at Boundary
Additional materials: www.superdatascience.com/859

Jan 31, 2025 • 7min
858: Are You The Account Executive We’re Looking For?
Are you an Account Executive with experience in the technology sector? In this Five-Minute Friday, Jon Krohn tells listeners about an exciting new role that has opened up at The SuperDataScience Podcast.
Additional materials: www.superdatascience.com/858
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Jan 28, 2025 • 1h 23min
857: How to Ensure AI Agents Are Accurate and Reliable, with Brooke Hopkins
Brooke Hopkins speaks to Jon Krohn about technology’s new frontiers in AI agents, how these agents will impact society, work and our creative enterprises, and what this might mean for our data-driven future. You will learn how Coval, a simulation and evaluation platform for AI voice and chat agents, helps companies balance precision and scalability while making few concessions on the way.
This episode is brought to you by ODSC, the Open Data Science Conference. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
(07:49) What Coval does and how the platform works
(21:16) Coval’s workflows
(37:40) The future of AI agents
(46:28) The metrics to evaluate performance
(55:08) How close we are to achieving AI agent autonomy
Additional materials: www.superdatascience.com/857

Jan 24, 2025 • 10min
856: The Fastest-Growing Jobs Are AI Jobs
Get excited: The fastest-growing jobs in the US are AI Engineer and AI Consultant. In this Five-Minute Friday, Jon Krohn looks into the reports that reveal this job growth, and the trends any data scientist and AI professional will want to watch in 2025.
Additional materials: www.superdatascience.com/856
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.