
Super Data Science: ML & AI Podcast with Jon Krohn
The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact. Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy. We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.
Latest episodes

8 snips
Mar 7, 2025 • 27min
868: In Case You Missed It in February 2025
Colleen Fotsch, a pro athlete turned data engineer, dives into essential tools like dbt for data management. She explains how dbt simplifies data modeling and automates processes, making life easier for data engineers. The conversation also touches on generative AI's role in fitness and data, highlighting its benefits for collaboration. The episode also revisits the BAML programming language, aimed at novice coders, and the applications of TabPFN in science and medicine.

113 snips
Mar 4, 2025 • 1h 33min
867: LLMs and Agents Are Overhyped, with Dr. Andriy Burkov
Dr. Andriy Burkov, a best-selling author and AI influencer, shares his insights on the future of AI, questioning the hype around AI agents and large language models. He discusses chatbot designs that avoid common pitfalls like hallucination. Burkov also reflects on the history of language modeling, the evolution of natural language processing, and how TalentNeuron uses data to transform talent management. He emphasizes the gap between human cognitive abilities and AI and voices skepticism about how effective AI is in real-world applications.

10 snips
Feb 28, 2025 • 8min
866: Bringing Back Extinct Animals like the Woolly Mammoth and Dodo Bird
The podcast dives into the fascinating realm of bringing extinct animals back to life, focusing on the woolly mammoth and dodo bird. It highlights the groundbreaking work of Colossal Biosciences and their use of genomic technologies, like CRISPR and artificial wombs. Discussions also touch on the implications for conservation, biodiversity credits, and the ethical aspects of playing god with nature. Can science really resurrect history? The conversation reveals both excitement and caution surrounding these ambitious biotech projects.

61 snips
Feb 25, 2025 • 54min
865: How to Grow (and Sell) a Data Science Consultancy, with Cal Al-Dhubaib
In this conversation, Cal Al-Dhubaib, Head of AI and Data Science at Further and former CEO of Pandata, shares his journey of building a data science consultancy. He discusses how to scale businesses by creating strong core values and effectively engaging clients. Cal highlights the importance of educating clients rather than gatekeeping information. He also addresses common pitfalls tech professionals face, such as not showing vulnerability and the need for both technical and soft skills in hires. His insights provide invaluable strategies for thriving in the data science sector.

21 snips
Feb 21, 2025 • 8min
864: OpenAI’s o3-mini: SOTA reasoning and exponentially cheaper
Dive into the fascinating world of OpenAI’s o3-mini, showcasing its cutting-edge reasoning capabilities. Discover how it stacks up against top competitors like DeepSeek-R1, GPT-4o, and Claude 3.5 Sonnet. The discussion highlights significant enhancements in logical processing and the novel access through ChatGPT. All of this is packed into a quick yet insightful review that's perfect for tech enthusiasts!

52 snips
Feb 18, 2025 • 1h 6min
863: TabPFN: Deep Learning for Tabular Data (That Actually Works!), with Prof. Frank Hutter
In this engaging discussion, Professor Frank Hutter, an AI expert from Universität Freiburg and co-founder of Prior Labs, unveils his groundbreaking TabPFN architecture designed for tabular data. He explains how this innovative model outperforms traditional methods, even with limited datasets, and shares its exciting applications across various sectors like healthcare and finance. Frank also dives into the role of Bayesian inference, synthetic data, and the impressive capabilities of TabPFN in handling time series analysis, showcasing advancements that could revolutionize predictive modeling.

17 snips
Feb 14, 2025 • 32min
862: In Case You Missed It in January 2025
Florian Neukart, a quantum computing expert, discusses how this cutting-edge technology can revolutionize optimization across various fields. Brooke Hopkins, an engineer and entrepreneur, shares insights on her company Coval, which helps users assess AI agents using tailored metrics. They explore the challenges of understanding exponential growth in AI and the intricacies of selecting foundation models. Their conversation also delves into innovative ways to evaluate conversational agents, emphasizing the importance of context and dynamic performance metrics.

49 snips
Feb 11, 2025 • 2h 1min
861: From Pro Athlete to Data Engineer: Colleen Fotsch’s Inspiring Journey
Colleen Fotsch, a former professional athlete turned Data Platform Senior Technical Manager at CHG Healthcare, shares her inspiring transition from swimming and CrossFit to the data analytics field. She discusses how her competitive mindset shaped her success, her innovative program merging fitness with data analytics for busy professionals, and the importance of mentorship in her career journey. Colleen also delves into the role of analytics engineering, emphasizing its significance as a bridge between data engineering and analysis.

35 snips
Feb 7, 2025 • 13min
860: DeepSeek R1: SOTA Reasoning at 1% of the Cost
Curious about cut-rate AI breakthroughs? Discover the impressive rise of DeepSeek's R1 model, a newcomer shaking up the market alongside giants like OpenAI and Google. Learn how this Chinese innovation efficiently competes at a fraction of the cost, prompting discussions on global tech dynamics. Dive into the $500 billion Stargate AI initiative and its seismic industry impacts, while also uncovering advancements in sustainable LLM training that aim for fair access in the AI landscape.

37 snips
Feb 4, 2025 • 60min
859: BAML: The Programming Language for AI, with Vaibhav Gupta
Vaibhav Gupta, Founder and CEO of Boundary, discusses the revolutionary BAML programming language designed to slash AI costs by up to 30%. He shares insights on natural language generation and how BAML streamlines AI interactions, enhancing data clarity. Gupta compares prompt engineering to traditional engineering, emphasizing its growing importance. Additionally, he reveals his unique hiring process, which prioritizes communication skills. Listeners will also learn about the use of retrieval-augmented generation technology and the future potential of BAML.