
Vanishing Gradients
A podcast about all things data, brought to you by data scientist Hugo Bowne-Anderson.
It's time for more critical conversations about the challenges in our industry in order to build better compasses for the solution space! To this end, this podcast will consist of long-format conversations between Hugo and other people who work broadly in the data science, machine learning, and AI spaces. We'll dive deep into all the moving parts of the data world, so if you're new to the space, you'll have an opportunity to learn from the experts. And if you've been around for a while, you'll find out what's happening in many other parts of the data world.
Latest episodes

Nov 14, 2023 • 1h 8min
Episode 21: Deploying LLMs in Production: Lessons Learned
Guest Hamel Husain, a machine learning engineer, discusses the business value of large language models (LLMs) and generative AI. They cover common misconceptions, necessary skills, and techniques for working with LLMs. The podcast explores the challenges of working with ML software and ChatGPT, the importance of data cleaning and analysis, and deploying LLMs in production with guardrails. They also discuss an AI-powered real estate CRM and optimizing marketing strategies through data analysis.

Oct 5, 2023 • 1h 27min
Episode 20: Data Science: Past, Present, and Future
Chris Wiggins, chief data scientist at The New York Times, and Matthew Jones, professor of history at Princeton University, discuss their book on the history of data and its impact on society. They explore topics such as the use of data for decision making, the development of statistical techniques, the influence of Francis Galton on eugenics, and the rise of data, compute, and algorithms in various fields.

Aug 14, 2023 • 1h 23min
Episode 19: Privacy and Security in Data Science and Machine Learning
Hugo chats with Katharine Jarmul, a Principal Data Scientist at Thoughtworks Germany, specializing in privacy and ethics in data workflows. They dive into the vital distinctions between data privacy and security, demystifying common misconceptions. Katharine highlights the impact of GDPR and CCPA, and explores advanced concepts like federated learning and differential privacy. They also tackle real-world issues like privacy attacks and the ethical responsibilities of data scientists, making a compelling case for prioritizing privacy in data practices.

May 24, 2023 • 1h 13min
Episode 18: Research Data Science in Biotech
Eric Ma, a leader on the research team at Moderna Therapeutics, discusses the tools and techniques used for drug discovery, the importance of machine learning and Bayesian inference, and the cultural questions surrounding hiring and management in research data science in biotech. They also explore the tech stack used in their work, the skills and hiring considerations in biotech, the importance of data testing and standardizing Excel spreadsheets, and the current state and challenges of Bayesian inference.

Feb 17, 2023 • 1h 16min
Episode 17: End-to-End Data Science
Hugo speaks with Tanya Cashorali, a data scientist and consultant who helps businesses get the most out of data, about what end-to-end data science looks like across many industries, such as retail, defense, biotech, and sports, including
scoping out projects,
figuring out the correct questions to ask,
how projects can change,
delivering on the promise,
the importance of rapid prototyping,
what it means to put models in production, and
how to measure success.
And much more, all the while grounding their conversation in real-world examples from data science, business, and life.
In a world where most organizations think they need AI and yet only 10-15% of data science actually involves model building, it’s time to get real about how data science and machine learning actually deliver value!
LINKS
Tanya on Twitter
Vanishing Gradients on YouTube
Saving millions with a Shiny app | Data Science Hangout with Tanya Cashorali
Our next livestream: Research Data Science in Biotech with Eric Ma

Dec 14, 2022 • 1h 23min
Episode 16: Data Science and Decision Making Under Uncertainty
JD Long, agricultural economist and quant, discusses decision making under uncertainty in data science, common mistakes, heuristics for decision-making, and the impact of cognitive biases. Topics include coupling data science with decision-making, model building, storytelling, and the intersection of cognitive biases.

Dec 7, 2022 • 54min
Episode 15: Uncertainty, Risk, and Simulation in Data Science
Hugo speaks with JD Long, agricultural economist, quant, and stochastic modeler, about decision making under uncertainty and how we can use our knowledge of risk, uncertainty, probabilistic thinking, causal inference, and more to help us use data science and machine learning to make better decisions in an uncertain world.
This is part 1 of a two-part conversation. In part 1, we discuss risk, uncertainty, probabilistic thinking, and simulation, all with a view towards improving decision making, and we draw on examples from our personal lives, the pandemic, our jobs, the reinsurance space, and the corporate world. In part 2, we’ll get into the nitty-gritty of decision making under uncertainty.
As JD says, and I paraphrase, “You may think you train your models, but your models are really training you.”
Links
Vanishing Gradients' new YouTube channel!
JD on Twitter
Executive Data Science, episode 5 of Vanishing Gradients, in which Jim Savage and Hugo talk through decision making and why you should always be integrating your loss function over your posterior
Fooled by Randomness by Nassim Taleb
Superforecasting: The Art and Science of Prediction by Philip E. Tetlock and Dan Gardner
Thinking in Bets by Annie Duke
The Signal and the Noise: Why So Many Predictions Fail by Nate Silver
Thinking, Fast and Slow by Daniel Kahneman

Nov 20, 2022 • 1h 9min
Episode 14: Decision Science, MLOps, and Machine Learning Everywhere
Hugo Bowne-Anderson discusses decision science, MLOps, and the ubiquity of machine learning models. Topics include decision-making under uncertainty, biases in data collection, MLOps and DevOps convergence, digital feedback loops, Google's search evolution, and the impact of modern algorithms on reality perception.

Oct 11, 2022 • 1h 23min
Episode 13: The Data Science Skills Gap, Economics, and Public Health
Hugo speaks with Norma Padron about data science education and continuous learning for people working in healthcare, broadly construed, along with how we can think about the democratization of data science skills more generally.
Norma is CEO of EmpiricaLab, where her team’s mission is to bridge work and training and empower healthcare teams to focus on what they care about the most: patient care. In a word, EmpiricaLab is a platform focused on peer learning and last-mile training for healthcare teams.
As you’ll discover, Norma’s background is fascinating: she holds a Ph.D. in health policy and management from Yale University and a master's degree in economics from Duke University (among other things), and has since worked with multiple early-stage digital health companies to accelerate their growth and scale. This is a wide-ranging conversation about how and where learning actually occurs, particularly with respect to data science. We talk about how the worlds of economics and econometrics, including causal inference, can be used to make data science a more robust and less fragile field, and why these disciplines are essential to both public and health policy. It was really invigorating to talk about the data skills gap that exists in organizations and how Norma’s team at EmpiricaLab is thinking about solving it in the health space using a three-tiered solution of content creation, a social layer, and an information discovery platform.
All of this is in service of a key question we’re facing in this field: how do you get the right data skills, tools, and workflows into the hands of the people who need them when the space is evolving so quickly?
Links
Norma's website
EmpiricaLab
Norma on Twitter

Sep 30, 2022 • 1h 33min
Episode 12: Data Science for Social Media: Twitter and Reddit
Hugo speaks with Katie Bauer about her time working in data science at both Twitter and Reddit. At the time of recording, Katie was a data science manager at Twitter and, prior to that, a founding member of the data team at Reddit. She’s now Head of Data Science at GlossGenius, so congrats on the new job, Katie!
In this conversation, we dive into what type of challenges social media companies face that data science is equipped to solve: in doing so, we traverse
the differences and similarities between companies such as Twitter and Reddit,
the major differences between being an early member of a data team and joining an established data function at a larger organization,
the supreme importance of robust measurement and telemetry in data science, along with
the mixed incentives for career data scientists, such as building flashy new things instead of maintaining existing infrastructure.
I’ve always found conversations with Katie to be a treasure trove of insights into data science and machine learning practice, along with key learnings about data science management.
In a word, Katie helps me to understand our space better. In this conversation, she told me that one important function data science can serve in any organization is creating a shared context for lots of different people in the org. We dive deep into what this actually means, how it can play out, traversing the world of dashboards, metric stores, feature stores, machine learning products, the need for top-down support, and much, much more.