
Super Data Science: ML & AI Podcast with Jon Krohn
The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact.
Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy.
We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.
Latest episodes

Sep 8, 2023 • 7min
712: Code Llama
Code Llama might just be starting the revolution for how data scientists code. In this Five-Minute Friday, host Jon Krohn investigates the suite of models under the free-to-use Code Llama and how to find the best fit for your project’s needs.
Additional materials: www.superdatascience.com/712
Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.

Sep 5, 2023 • 1h 26min
711: Image, Video and 3D-Model Generation from Natural Language, with Dr. Ajay Jain
In this episode, host Jon Krohn explores with his guest Ajay Jain, Co-Founder of Genmo.ai, how creative general intelligence could take the video industry by storm. They also discuss the models that got Genmo to this point, the applications of NeRF, and why understanding human psychology is essential to developing models that output high-fidelity video.
This episode is brought to you by the Zerve data science dev environment, by Grafbase, the unified data layer, and by Modelbit, for deploying models in seconds.
In this episode you will learn:
• About Genmo.ai and the term “creative general intelligence” [03:47]
• Why Ajay started Genmo.ai [09:26]
• The increased performance of multimodal models [21:12]
• All about Denoising Diffusion Probabilistic Models (DDPMs) [31:03]
• The application of Neural Radiance Fields (NeRF) [55:26]
• Predicting pedestrian behavior at Uber [1:01:50]
• How to save money in the process of training models [1:12:42]
Additional materials: www.superdatascience.com/711

Sep 1, 2023 • 1h 3min
710: LangChain: Create LLM Applications Easily in Python
Discover the power of Large Language Models with Kris Ograbek as he unravels the intricacies of LangChain and showcases a chatbot in action, all while putting our host Jon Krohn in the hot seat!
Additional materials: www.superdatascience.com/710

Aug 29, 2023 • 1h 21min
709: Big A.I. R&D Risks Reap Big Societal Rewards, with Meta's Dr. Laurens van der Maaten
Meta's Senior Research Director, Dr. Laurens van der Maaten, takes center stage to unravel the captivating realm of AI innovation. Learn about his groundbreaking contributions, including pioneering the t-SNE dimensionality reduction technique and harnessing AI for novel protein synthesis, climate change mitigation, and wearable materials simulation. Join us to explore the transformative power of AI across diverse domains and gain a glimpse into its future societal implications.
This episode is brought to you by AWS Inferentia, by Modelbit, for deploying models in seconds, and by Grafbase, the unified data layer.
In this episode you will learn:
• Large-scale learning of image recognition models on web data [05:05]
• Evolutionary Scale Modeling protein models [16:45]
• Fighting climate change by building an A.I. model [29:49]
• The CrypTen privacy-preserving ML framework [38:36]
• Concerns about adversarial examples [53:25]
• Laurens’ t-SNE algorithm [58:56]
• How to make a big impact [1:07:25]
Additional materials: www.superdatascience.com/709

Aug 25, 2023 • 23min
708: ChatGPT Code Interpreter: 5 Hacks for Data Scientists
On this week’s Five-Minute Friday, host Jon Krohn gives five reasons why he is so excited about ChatGPT’s Code Interpreter and walks listeners through its capabilities with a practical example.
Additional materials: www.superdatascience.com/708

Aug 22, 2023 • 1h 47min
707: Vicuña, Gorilla, Chatbot Arena and Socially Beneficial LLMs, with Prof. Joey Gonzalez
LLM Vicuña, Chatbot Arena, and the race to increase LLM context windows: This episode’s guest Joey Gonzalez talks to Jon Krohn about developing models and platforms that leverage and improve LLMs, as well as the future of AI development and access.
This episode is brought to you by the AWS Insiders Podcast, by Modelbit, for deploying models in seconds, and by Grafbase, the unified data layer.
In this episode you will learn:
• Vicuña: How the revolutionary LLM came to be [03:35]
• Chatbot Arena: The leading LLM leaderboard [09:47]
• Trusting LLM results [17:54]
• Gorilla: The open-source ChatGPT plugin alternative [32:13]
• About LMSYS and long context windows [47:48]
• Open- vs closed-source LLMs: Which is better? [1:01:39]
• Aqueduct [1:16:49]
• Founding GraphLab [1:27:02]
• How AI will positively impact society in the coming decades [1:32:31]
Additional materials: www.superdatascience.com/707

Aug 18, 2023 • 33min
706: Large Language Model Leaderboards and Benchmarks
In this episode, Caterina Constantinescu dives deep into Large Language Models (LLMs), spotlighting top leaderboards, evaluation benchmarks, and real-world user perceptions. Plus, discover the challenges of dataset contamination and the intricacies of platforms like HELM and Chatbot Arena.
Additional materials: www.superdatascience.com/706

Aug 15, 2023 • 1h 29min
705: Feeding the World with ML-Powered Precision Agriculture
Join Jon Krohn as he chats with Syngenta Group's Feroz Sheikh, Jeremy Groeteke, and Thomas Jung about the digital revolution in agriculture. Learn how data science is transforming farming, from precision techniques to global food solutions. A compelling blend of tech and nature.
This episode is brought to you by AWS Inferentia and by Modelbit, for deploying models in seconds.
In this episode you will learn:
• What is precision agriculture? [09:43]
• What is computational agronomy? [12:30]
• How Syngenta helps growers optimize yields [21:37]
• How to bridge the gap between R&D and the real world [33:58]
• What is generative chemistry? [37:52]
• How generative chemistry accelerates the discovery of new compounds [41:55]
• How you could make a big social impact in agriculture with data science [56:22]
• How to go about designing ML models for agriculture [1:00:27]
Additional materials: www.superdatascience.com/705

Aug 11, 2023 • 5min
704: Jon’s “Generative A.I. with LLMs” Hands-on Training
Take on the world of GPT and learn to develop your own commercially successful Large Language Models (LLMs) with Jon Krohn’s comprehensive, guided training video for generative AI. Get to grips with the technology, learn which tools to use, and find out how to develop an eye for business-viable models with Jon’s (ad-)free educational video.
Additional materials: www.superdatascience.com/704

Aug 8, 2023 • 1h 9min
703: How Data Happened: A History, with Columbia Prof. Chris Wiggins
Statistics history, interdisciplinarity, and data and society. Chris Wiggins talks with Jon Krohn about the power dynamics of data, the transformation of the field of biology through data-driven approaches to genetic sequencing, and the New York Times’ data science team’s cutting-edge approach to accommodating its tech stack.
This episode is brought to you by the AWS Insiders Podcast and by Modelbit, for deploying models in seconds.
In this episode you will learn:
• The importance of the humanities in data science [09:18]
• How data science “rearranges” power [17:19]
• An overview of How Data Happened [20:36]
• The controversial nature of Bayes’ theorem [29:16]
• Why we need to consider data ethics [34:00]
• How biology came to adopt data science into its field [45:44]
• The data science tech stack at the New York Times [49:18]
Additional materials: www.superdatascience.com/703