Super Data Science: ML & AI Podcast with Jon Krohn

Jon Krohn
Aug 19, 2025 • 1h 10min

915: How to Jailbreak LLMs (and How to Prevent It), with Michelle Yi

Tech leader, investor, and Generationship cofounder Michelle Yi talks to Jon Krohn about finding ways to trust and secure AI systems, the methods that hackers use to jailbreak LLMs, and what users can do to build their own trustworthy AI systems. Learn all about “red teaming” and how tech teams can handle related threats such as data poisoning, prompt stealing, jailbreaking and slop squatting. This episode is brought to you by Trainium2, the latest AI chip from AWS, and by the Dell AI Factory with NVIDIA. Additional materials: www.superdatascience.com/915 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (03:31) What “trustworthy AI” means (31:15) How to build trustworthy AI systems (46:55) About Michelle’s “sorry bench” (48:13) How LLMs help construct causal graphs (51:45) About Generationship
Aug 15, 2025 • 26min

914: Data Lakes 101 (and Why They’re Key for AI Models), with Oz Katz

In this Five-Minute Friday, lakeFS cofounder and CTO Oz Katz talks to Jon Krohn about data warehouses, data lakes, and how companies can handle increasingly complex data infrastructures and formats. Hear about lakeFS’s collaboration with Legofest, lakeFS’s approach to helping users collaborate on data lakes, and how to overcome the challenges of working with multimodal data. Additional materials: www.superdatascience.com/914 This episode is brought to you by the Dell AI Factory with NVIDIA.
Aug 12, 2025 • 1h 15min

913: LLM Pre-Training and Post-Training 101, with Julien Launay

Julien Launay launched Adaptive to give data science teams in business enterprises their “RLOps tooling” to make reinforcement learning easier. Talking to Jon Krohn, Julien says, “Most of our users are data scientists who write Python codes to interface with the system”. Adaptive is also able to work with companies without data science teams, collaborating with partners like Deloitte to add the necessary personnel. Julien is currently working on making his platform more widely available. Additional materials: www.superdatascience.com/913 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Aug 8, 2025 • 33min

912: In Case You Missed It in July 2025

In this episode of In Case You Missed It, we look back on five great interview episodes from July. Hear from Lilith Bat-Leah (Episode 901), Sinan Ozdemir (Episode 903), Sebastian Gehrmann (Episode 905), Zohar Bronfman (Episode 907) and Robert Ness (Episode 909). They’ll tell you why data-centric machine learning is so important across disciplines, starting with law, and how we can use AI benchmarks and “red teaming” to refine our search for the best AI models. Additional materials: www.superdatascience.com/912 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Aug 5, 2025 • 58min

911: The Future of Python Notebooks is Here, with Marimo’s Dr. Akshay Agrawal

Reproducibility, Python notebooks, and data science communities: Software developer Akshay Agrawal speaks to Jon Krohn about Marimo, the next-generation computational notebook for Python, how he built and fostered a thriving community around the product, and what makes this notebook so versatile and accessible for users. Additional materials: www.superdatascience.com/911 This episode is brought to you by Trainium2, the latest AI chip from AWS, and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Aug 1, 2025 • 10min

910: AI is Disrupting Journalism: The Good, The Bad and The Opportunity

In this Five-Minute Friday, Jon Krohn looks into AI’s disruption of the journalism industry and how it has fundamentally reshaped news production. Lawsuits brought by multiple news outlets against ChatGPT over its use of copyrighted materials may have grabbed the most headlines to date, but this isn’t to say news media is rebuffing AI entirely. On the contrary, several outlets have launched summarization and analysis tools for both internal and external use, such as The New York Times’s Echo and The Washington Post’s Haystacker. This episode looks into the ways major news outlets are utilising AI, and what this means for journalists. Additional materials: www.superdatascience.com/910 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Jul 29, 2025 • 1h 22min

909: Causal AI, with Dr. Robert Usazuwa Ness

Microsoft researcher Robert Usazuwa Ness talks to Jon Krohn about how to achieve causality in AI with correlation-based learning, the right libraries, and handling statistical inference. When dealing with causal AI, Robert notes how important it is to stay aware of variables in the data that may mislead us and force inaccurate assumptions. Not all variables will be useful. It is essential, then, that any assumptions are grounded in a deeper understanding of how the data were gathered, and not just in what appears in the dataset. Listen to the episode to hear how you can apply causal AI to your projects. Additional materials: www.superdatascience.com/909 This episode is brought to you by Trainium2, the latest AI chip from AWS, and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Jul 25, 2025 • 9min

908: AI Agents Blackmail Humans 96% of the Time (Agentic Misalignment)

The moral and ethical implications of letting AI take the wheel in business, as revealed by Anthropic: Jon Krohn looks into Anthropic’s latest research on how to use and deploy LLMs safely, specifically in business environments. The team designed scenarios to test the behavior of AI agents when given a goal and a set of obstacles to reaching it. Those obstacles included 1) threats to the AI’s continued operation, and 2) conflict between the AI’s goals and the goals of the company. Hear Jon break down the results of this research in this Five-Minute Friday. Additional materials: www.superdatascience.com/908 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Jul 22, 2025 • 1h 21min

907: Neuroscience, AI and the Limitations of LLMs, with Dr. Zohar Bronfman

“Intelligence has many forms,” says Zohar Bronfman, who speaks with Jon Krohn about the fascinating intersection between computational neuroscience and philosophy, and how it has brought him closer to understanding what is necessary to develop human-like intelligence in machines. He also discusses his motivations for launching Pecan AI and why predictive models outstrip generative models in business. Additional materials: www.superdatascience.com/907 This episode is brought to you by Adverity, the conversational analytics platform, and by the Dell AI Factory with NVIDIA. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: (03:47) Why LLMs aren’t bringing us closer to AGI (33:44) About Pecan AI (51:03) Why data modeling is so challenging (1:01:25) How Pecan AI makes its tools widely accessible
Jul 18, 2025 • 29min

906: How Prof. Jason Corso Solved Computer Vision’s Data Problem

Jason Corso speaks to Jon Krohn in this Five-Minute Friday all about Voxel51’s latest tool, Verified Auto-Labelling, and the company’s incredible success in developing popular tools for computer vision. Additional materials: www.superdatascience.com/906 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
