
Super Data Science: ML & AI Podcast with Jon Krohn
The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact. Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy. We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.
Latest episodes

19 snips
Oct 22, 2024 • 1h 37min
829: Neuroscience Fueled by ML, with Prof. Bradley Voytek
In this engaging conversation, neuroscientist Bradley Voytek, a professor at UC San Diego and former data scientist at Uber, explores the fascinating intersection of neuroscience and machine learning. He shares recent breakthroughs in brain communication and the potential of brain rhythms as diagnostic tools. The discussion also highlights essential skills for aspiring data scientists, the impact of collaborative research in neuroscience, and insights from his time developing algorithms at Uber when it was still a startup. It's a captivating look at the future of both fields.

33 snips
Oct 18, 2024 • 20min
828: Are “Citizen Data Scientists” A Myth? With Keith McCormick
In this engaging conversation, Keith McCormick, Data Science Principal at Further and an expert in online education, tackles the concept of 'citizen data scientists.' He explores the impact of low-code and no-code tools on data science, emphasizing how they can empower professionals and foster collaboration with experienced data scientists. Keith also discusses the role of AutoML, clarifying its limitations and how it can complement traditional data science, rather than replace it. Tune in for insights on enhancing data governance and boosting project success!

30 snips
Oct 15, 2024 • 1h 14min
827: Polars: Past, Present and Future, with Polars Creator Ritchie Vink
Ritchie Vink, CEO and Co-Founder of Polars, Inc., is the creator of the Polars open-source data manipulation library. He shares insights into the impressive efficiency of Polars compared to traditional tools like Pandas. The conversation dives into the difference between eager and lazy execution modes, scalability for large datasets, and upcoming features like Polars Cloud. Ritchie also discusses the balance between maintaining open-source principles and expanding the company, teasing new functionalities that aim to refine data handling capabilities.

Oct 11, 2024 • 42min
826: In Case You Missed It in September 2024
Julia Silge, Engineering Manager at Posit, shares insights on the development of Positron, an IDE designed specifically for data scientists' unique coding needs. Luca Antiga offers tips on enhancing machine learning models in PyTorch, stressing the balance between model and data. Marco Gorelli discusses Polars, an open-source library that significantly speeds up data manipulation compared to Pandas. Marck Vaisman highlights essential traits for data scientist hiring, advocating for practical skills over traditional qualifications.

67 snips
Oct 8, 2024 • 1h 2min
825: Data Contracts: The Key to Data Quality, with Chad Sanderson
Chad Sanderson, CEO of Gable.ai and an expert in data quality and governance, shares insights on the transformative power of data contracts in modern data management. He explains how these contracts clarify expectations for data quality and promote better alignment between data producers and consumers. The conversation dives into 'shifting left' practices that tackle problems early, concerns about data debt, and the crucial role of human oversight. Chad also highlights storytelling as a tool for data teams to enhance communication and effectiveness.

10 snips
Oct 4, 2024 • 14min
824: Llama 3.2: Open-Source Edge and Multimodal LLMs
Discover the groundbreaking features of Llama 3.2, designed for lightweight on-device applications and advanced image processing. Explore how this release revolutionizes open-source AI, making it more accessible for developers. Uncover the Llama Stack toolkit aimed at simplifying deployment, and learn about Llama Guard 3, a new content moderation solution. With strong backing from cloud and hardware partners, Llama 3.2 is poised to redefine AI capabilities across mobile platforms and beyond.

15 snips
Oct 1, 2024 • 1h 21min
823: Virtual Humans and AI Clones, with Natalie Monbiot
Natalie Monbiot, Head of Strategy at Hour One, discusses the revolutionary impact of AI avatars on industries like L&D and e-commerce. She unpacks how virtual humans are reshaping digital communication and fostering real connections post-COVID. The conversation explores the ethical challenges of creating these avatars, transparency issues, and the virtual human economy. Natalie also highlights advancements in AI that enhance content engagement and emphasizes the importance of authenticity in AI-generated interactions. Get ready to rethink virtual communication!

7 snips
Sep 27, 2024 • 19min
822: NotebookLM: Jaw-Dropping Podcast Episodes Generated About Your Documents
Discover how NotebookLM, Google’s latest AI tool, can transform massive documents into engaging audio content. Jon Krohn shares his experience turning a 200-page dissertation into an 11-minute podcast. Unpack the fascinating interplay between genetics and anxiety, exploring how mouse studies reveal surprising similarities to humans. Dive into innovative research methodologies that link genetic diversity and environment to anxiety, while contemplating the ethical implications of gene therapy. This is a journey through the future of content creation and mental health.

25 snips
Sep 24, 2024 • 1h 13min
821: The Skills You Need to Be an Effective Data Scientist, with Marck Vaisman
Marck Vaisman, a Senior Cloud Solutions Architect at Microsoft and adjunct professor, shares his insights on effective data science roles. He reveals four key data practitioner personas and critiques common career roadmaps that often fall short. Discussing the elusive nature of the “data scientist” title, Marck emphasizes the importance of clarity in roles and skills. He highlights the essential blend of technical know-how and soft skills necessary for success, alongside the increasing significance of community and generative AI tools in the field.

25 snips
Sep 20, 2024 • 27min
820: OpenAI's o1 "Strawberry" Models
Explore the groundbreaking capabilities of OpenAI's latest o1 'Strawberry' models. Discover how these models revolutionize AI with advanced reasoning skills, mirroring human thought processes. Delve into their strengths and limitations as they signify a potential turning point in generative AI technology. Gain insight into the future implications of these models, especially in relation to the concept of singularity.