

Super Data Science: ML & AI Podcast with Jon Krohn
Jon Krohn
The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact.
Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy.
We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.
Episodes

Aug 2, 2024 • 19min
806: Llama 3.1 405B: The First Open-Source Frontier LLM
Llama 3.1 is here, and it’s a game-changer. Meta’s latest AI model, especially the massive 405B variant, finally brings an open-source option to compete with giants like OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet. While Meta didn’t fully open-source everything, the availability of "open weights" is a strategic move to shake up the AI landscape. The model boasts an impressive 128,000-token context window and multilingual support in eight languages. Meta is also focusing on responsible AI development with tools like Llama Guard 3 for content moderation. This release is more than just a tech upgrade: it's about democratizing AI and sparking innovation across industries. How will you leverage Llama 3.1 to make a real impact? Tune in to this week’s FMF episode and let’s explore the future with this latest AI development together.
Additional materials: www.superdatascience.com/806
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Jul 30, 2024 • 55min
805: How to Be a Supercommunicator, with Charles Duhigg
Become a Supercommunicator! New York Times bestselling author Charles Duhigg, known for The Power of Habit and Smarter Faster Better, gets real about mastering communication in this episode. Discover insights from his latest book, Supercommunicators, where he reveals how to align conversation styles for deeper connections, how to handle conflicts effectively, and why AI can't replicate the emotional depth of human interactions.
This episode is brought to you by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
• The inspirations behind Supercommunicators [03:41]
• The three types of conversations: practical, emotional, and social [05:22]
• The matching principle: Align communication styles for better connection [10:36]
• What neural entrainment is: Achieve a mind meld through synchronized brain activity [13:22]
• The series of steps/principles to connect with someone [24:39]
• How to avoid or de-escalate conflict conversations [31:07]
• The impact of GenAI on conversations: How AI mimics dialogue but lacks emotional depth [45:24]
Additional materials: www.superdatascience.com/805

Jul 26, 2024 • 14min
804: AI x Solar Power = Abundant Energy
Solar power now provides 6% of the world's electricity, thanks to rapid growth. Host Jon Krohn discusses the factors driving this rise, the challenges ahead, and how AI and data science are optimizing solar technologies. Tune in for insights on the future of solar power, and don't forget to like, share, and subscribe!
Additional materials: www.superdatascience.com/804
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Jul 23, 2024 • 1h 55min
803: How to Thrive in Your (Data Science) Career, with Daliana Liu
Daliana Liu is a big name in data science teaching, and she has always been generous in sharing everything she knows about getting a job in data science. In this episode, she continues to extend her generosity, helping listeners define their approach to achieving a fulfilling career in data science and tech.
This episode is brought to you by AWS Inferentia and AWS Trainium, by Babbel, the science-backed language-learning platform, and by Gurobi, the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
• Common career challenges for data scientists [34:57]
• Advice for people who don’t know where to go in their career [48:05]
• How to build resilience and protect against Imposter Syndrome [1:06:23]
• Skills that data scientists should develop today [1:39:17]
• The future of the data science and AI job market [1:46:55]
Additional materials: www.superdatascience.com/803

Jul 19, 2024 • 24min
802: In Case You Missed It in June 2024
How to grab investor interest with your AI startup idea, revisiting algorithms, and helping practitioners ensure AI safety with regulatory frameworks and beyond: This month, you missed a whole bunch of great interviews. But don’t worry, Jon Krohn is here to recap all the best bits for you!
Additional materials: www.superdatascience.com/802
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Jul 16, 2024 • 1h 17min
801: Merged LLMs Are Smaller And More Capable, with Arcee AI's Mark McQuade and Charles Goddard
Merged LLMs are the future, and we’re exploring how with Mark McQuade and Charles Goddard from Arcee AI on this episode with Jon Krohn. Learn how to combine multiple LLMs without adding bulk, train more efficiently, and dive into different expert approaches. Discover how smaller models can outperform larger ones and leverage open-source projects for big enterprise wins. This episode is packed with must-know insights for data scientists and ML engineers. Don’t miss out!
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
• Explanation of Charles' job title: Chief of Frontier Research [03:31]
• Model Merging Technology combining multiple LLMs without increasing size [04:43]
• Using MergeKit for model merging [14:49]
• Evolutionary Model Merging using evolutionary algorithms [22:55]
• Commercial applications and success stories [28:10]
• Comparison of Mixture of Experts (MoE) vs. Mixture of Agents [37:57]
• Spectrum Project for efficient training by targeting specific modules [54:28]
• Future of Small Language Models (SLMs) and their advantages [01:01:22]
Additional materials: www.superdatascience.com/801

Jul 12, 2024 • 44min
800: A Transformative Century of Technological Progress, with Annie P.
The SuperDataScience Podcast is celebrating its 800th episode! Host Jon Krohn speaks to his grandmother, Annie, about growing up at a time when so many technologies we take for granted today were yet to be developed. Listen in to hear Annie’s experience of the changes in technology across 94 years and how she and her family fared in 1940s Ukraine with no electricity or running water.
Additional materials: www.superdatascience.com/800

Jul 9, 2024 • 1h 46min
799: AGI Could Be Near: Dystopian and Utopian Implications, with Dr. Andrey Kurenkov
No-code games with GenAI, the creative possibilities of LLMs, and our proximity to AGI: In this episode, Jon Krohn talks to Andrey Kurenkov about what turned him from an AGI skeptic to a positivist. You’ll also hear about his wildly popular podcast “Last Week in AI” and how the NVIDIA-backed startup Astrocade is helping videogame enthusiasts to create their own games through generative AI. A must-listen!
This episode is brought to you by AWS Inferentia and AWS Trainium. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
• All about The Gradient and Last Week in AI [10:42]
• All about Astrocade and Andrey’s role at the startup [24:35]
• Balancing UX and creative control at Astrocade [42:00]
• The creative possibilities of LLMs [1:04:15]
• The rapid emergence of AGI [1:10:31]
Additional materials: www.superdatascience.com/799

Jul 5, 2024 • 15min
798: Claude 3.5 Sonnet: Frontier Capabilities & Slick New "Artifacts" UI
Claude 3.5 Sonnet, Anthropic’s newest model, is making waves in the AI community. This mid-size model outshines the larger Claude 3 Opus in tasks like code generation, content creation, and document summarization, and it’s twice as fast. In this episode of The Super Data Science Podcast, Jon Krohn discusses its top-notch performance across benchmarks like MMLU, GPQA, and HumanEval, along with its improved machine vision capabilities. Plus, learn about the new Artifacts UI feature, which makes managing generated content easier by displaying outputs side-by-side with inputs. Tune in to find out why Claude 3.5 Sonnet is setting new standards in AI.
Additional materials: www.superdatascience.com/798
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Jul 2, 2024 • 1h 10min
797: Deep Learning Classics and Trends, with Dr. Rosanne Liu
Dr. Rosanne Liu, Research Scientist at Google DeepMind and co-founder of the ML Collective, shares her journey and the mission to democratize AI research. She explains her pioneering work on intrinsic dimensions in deep learning and the advantages of curiosity-driven research. Jon and Dr. Liu also explore the complexities of understanding powerful AI models, the specifics of character-aware text encoding, and the significant impact of diversity, equity, and inclusion in the ML community. With publications in NeurIPS, ICLR, ICML, and Science, Dr. Liu offers her expertise and vision for the future of machine learning.
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
• How the ML Collective came about [03:31]
• The concept of a failure CV [16:12]
• ML Collective research topics [19:03]
• How Dr. Liu's work on the “intrinsic dimension” of deep learning models inspired the now-standard LoRA approach to fine-tuning LLMs [21:28]
• The pros and cons of curiosity-driven vs. goal-driven ML research [29:08]
• Discussion on Dr. Liu's research and papers [33:17]
• Character-aware vs. character-blind text encoding [54:59]
• The positive impacts of diversity, equity, and inclusion in the ML community [57:51]
Additional materials: www.superdatascience.com/797