

Super Data Science: ML & AI Podcast with Jon Krohn
Jon Krohn
The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact.
Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy.
We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.
Episodes

Jun 28, 2024 • 43min
796: Earth's Coming Population Collapse and How AI Can Help, with Simon Kuestenmacher
Want to feel optimistic about your day? In this Friday episode, Simon Kuestenmacher talks to Jon Krohn about demography: what it is, why it’s so important, and why its forecasts should give us reason to hope for a better future. In an increasingly globalized world, and with an aging population in countries with the biggest GDPs, demography is more valuable than ever.
Additional materials: www.superdatascience.com/796
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Jun 25, 2024 • 1h 8min
795: Fast-Evolving Data and AI Regulatory Frameworks, with Dr. Gina Guillaume-Joseph
Gina Guillaume-Joseph talks to Jon Krohn about the data and regulatory frameworks set to transform the AI industry and why that’s important to anyone working with data. This episode offers a solid path to understanding AI regulation’s past, present and future. Gina walks listeners through the AI Bill of Rights, the NIST AI Risk Management Framework and the MITRE ATLAS threat model.
This episode is brought to you by AWS Inferentia and AWS Trainium, by Crawlbase, the ultimate data crawling platform, and by Babbel, the science-backed language-learning platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
• What “responsible AI” means [08:14]
• Why the federal government should be behind AI regulation [12:22]
• The US vs EU on AI regulation [18:46]
• About the AI Bill of Rights [26:14]
• About MITRE and the MITRE ATLAS [37:19]
• What a systems engineer does [54:11]
Additional materials: www.superdatascience.com/795

Jun 21, 2024 • 11min
794: Exciting (and Frightening!) Trends in Open-Source AI
Trends in open-source AI: Join Jon Krohn and a panel of data science icons as they discuss the most exciting and concerning developments in open-source AI. Hear insights from Drew Conway, Jared Lander, Emily Zabor, and JD Long on the transformative potential of AI and its future impact.
Additional materials: www.superdatascience.com/794
Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

Jun 18, 2024 • 1h 33min
793: Bayesian Methods and Applications, with Alexandre Andorra
Bayesian methods take the spotlight in this episode with Alex Andorra, co-founder of PyMC Labs, and Jon Krohn. Learn how Bayesian techniques handle tough problems, make the most of prior knowledge, and work wonders with limited data. Alex and Jon break down essentials like the PyMC, PyStan, and NumPyro libraries, show how to boost model efficiency with PyTensor, and talk about using ArviZ for top-notch diagnostics and visualizations. Plus, get into advanced modeling with Gaussian processes.
This episode is brought to you by Crawlbase, the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
In this episode you will learn:
• Practical introduction to Bayesian statistics [04:54]
• Definition and significance of epistemology [17:52]
• Explanation of PyMC and Monte Carlo methods [27:57]
• How to get started with Bayesian modeling and PyMC [34:26]
• PyMC Labs and its consulting services [50:50]
• ArviZ for post-modeling diagnostics and visualization [01:02:23]
• Gaussian processes and their applications [01:09:02]
Additional materials: www.superdatascience.com/793

Jun 14, 2024 • 23min
792: In Case You Missed It in May 2024
Jon Krohn shares his favorite clips from May. Hear how Navdeep Martin is spearheading a company to tackle the climate crisis and why Sol Rashidi and Demetrios Brinkmann find nailing job titles so necessary in the fast-paced industries of tech and AI, and get the latest on embeddings with Luis Serrano.
Additional materials: www.superdatascience.com/792
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

Jun 11, 2024 • 57min
791: Reinforcement Learning from Human Feedback (RLHF), with Dr. Nathan Lambert
Reinforcement learning from human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique’s origins. He also walks through other ways to fine-tune LLMs and explains how he believes generative AI might democratize education.
This episode is brought to you by AWS Inferentia (go.aws/3zWS0au) and AWS Trainium (go.aws/3ycV6K0), and Crawlbase (crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• Why it is important that AI is open [03:13]
• The efficacy and scalability of direct preference optimization [07:32]
• Robotics and LLMs [14:32]
• The challenges to aligning reward models with human preferences [23:00]
• How to make sure AI’s decision making on preferences reflects desirable behavior [28:52]
• Why Nathan believes AI is closer to alchemy than science [37:38]
Additional materials: www.superdatascience.com/791

Jun 7, 2024 • 7min
790: Open-Source Libraries for Data Science at the New York R Conference
The experts reveal their top open-source R libraries live from the New York R Conference! This Super Data Science Podcast episode features an exclusive panel with data science trailblazers Drew Conway, Jared Lander, Emily Zabor, and JD Long. They share their favorite R libraries and valuable insights to enhance your data science practice.
Additional materials: www.superdatascience.com/790
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

Jun 4, 2024 • 1h 15min
789: ML for Wind-Powered Energy Generation, with Dr. Jason Yosinski
Machine learning for wind energy is front and center in this episode as Jon Krohn is joined by Dr. Jason Yosinski, CEO of Windscape AI. Dr. Yosinski brings to light the latest ML advancements sparking significant changes in renewable energy. Tune in for a comprehensive review of these cutting-edge technologies and their expansive impact on the industry and the environment's well-being.
This episode is brought to you by Crawlbase, the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• Enhancing predictability in wind energy with ML [04:52]
• Data utilization from wind turbines by energy providers [11:41]
• Jason's journey into wind energy [17:55]
• Landing the right startup idea [22:47]
• Visualizing neural networks with the Deep Vis Toolbox [31:29]
• Extreme event forecasting at Uber vs. nowcasting at Windscape AI [45:13]
• Discoveries from Loss Change Allocation research [47:48]
• Engaging with Jason's ML Collective [59:46]
• Traits of successful AI entrepreneurs [1:10:26]
Additional materials: www.superdatascience.com/789

May 31, 2024 • 10min
788: Multi-Agent Systems: How Teams of LLMs Excel at Complex Tasks
Multi-agent systems could mark a significant turning point in generative AI. From mastering increasingly complex tasks to getting LLMs to collaborate, in this Five-Minute Friday, Jon Krohn discusses the systems that are working to bridge the remaining gaps left by the latest large language models (LLMs).
Additional materials: www.superdatascience.com/788
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

May 28, 2024 • 56min
787: MLOps: The Job and The Key Tools, with Demetrios Brinkmann
MLOps, how to build an online community, and tools for scaling LLMs: In this episode, Demetrios Brinkmann speaks to Jon Krohn about the similarities and differences between LLMOps, MLOps and DevOps, and why this should matter to companies looking to hire such engineers. You will also hear how to get involved in the MLOps community wherever you are in the world, and how you can start developing great products with the available tools.
This episode is brought to you by AWS Inferentia (go.aws/3zWS0au) and AWS Trainium (go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
In this episode you will learn:
• What MLOps is [03:51]
• About LLMOps [12:06]
• About LlamaIndex and Ollama [18:29]
• Insights from Demetrios’ MLOps survey [20:49]
• Guidance for using third-party APIs [40:18]
• Recommendations for building an online community in tech and AI [47:07]
Additional materials: www.superdatascience.com/787