

Learning Bayesian Statistics
Alexandre Andorra
Are you a researcher or data scientist / analyst / ninja? Do you want to learn Bayesian inference, stay up to date or simply want to understand what Bayesian inference is?
Then this podcast is for you! You'll hear from researchers and practitioners of all fields about how they use Bayesian statistics, and how in turn YOU can apply these methods in your modeling workflow.
When I started learning Bayesian methods, I really wished there were a podcast out there that could introduce me to the methods, the projects and the people who make all that possible.
So I created "Learning Bayesian Statistics", where you'll get to hear how Bayesian statistics are used to detect dark matter in outer space, forecast elections or understand how diseases spread and can ultimately be stopped.
But this show is not only about successes -- it's also about failures, because that's how we learn best. So you'll often hear the guests talking about what *didn't* work in their projects, why, and how they overcame these challenges. Because, in the end, we're all lifelong learners!
My name is Alex Andorra by the way, and I live in Estonia. By day, I'm a data scientist and modeler at the PyMC Labs consultancy (https://www.pymc-labs.io/). By night, I don't (yet) fight crime, but I'm an open-source enthusiast and core contributor to the Python packages PyMC (https://docs.pymc.io/) and ArviZ (https://arviz-devs.github.io/arviz/). I also love election forecasting (https://www.pollsposition.com/) and, most importantly, Nutella. But I don't like talking about it – I prefer eating it.
So, whether you want to learn Bayesian statistics or hear about the latest libraries, books and applications, this podcast is for you -- just subscribe! You can also support the show and unlock exclusive Bayesian swag on Patreon (https://www.patreon.com/learnbayesstats)!
Episodes

Feb 26, 2020 • 44min
#10 Exploratory Analysis of Bayesian Models, with ArviZ and Ari Hartikainen
How do you handle your MCMC samples once your Bayesian model has fit properly? Which diagnostics do you check to see if there was a computational problem? And isn’t it nice when you have beautiful and reliable plots to complement your analysis and better understand your model?

I know what you think: plotting can be long and complicated in these cases. Well, not with ArviZ, a platform-agnostic package to do exploratory analysis of your Bayesian models. And in this episode, Ari Hartikainen will tell you why.

Ari is a data scientist in geophysics and a researcher at the Department of Civil Engineering of Aalto University in Finland. He mainly works on geophysics, Bayesian statistics and visualization. Ari’s also a prolific open-source contributor, as he’s a core developer of the popular Stan and ArviZ libraries. He’ll tell us how PyStan interacts with ArviZ, what he thinks ArviZ’s most useful features are, and which common difficulties he encounters with his models and data.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Ari on GitHub: https://github.com/ahartikainen
Ari on Twitter: https://twitter.com/a_hartikainen
ArviZ -- Exploratory analysis of Bayesian models: https://arviz-devs.github.io/arviz/
Introductory paper of ArviZ in The Journal of Open Source Software: https://www.researchgate.net/publication/330402908_ArviZ_a_unified_library_for_exploratory_analysis_of_Bayesian_models_in_Python
Stan -- Statistical Modeling Platform: https://mc-stan.org/
GPflow -- Gaussian processes in TensorFlow: https://www.gpflow.org/
GPy -- Gaussian processes framework in Python: https://sheffieldml.github.io/GPy/

Feb 12, 2020 • 54min
#9 Exploring the Cosmos with Bayes and Maggie Lieu
Have you always wondered what dark matter is? Can we even see it, let alone measure it? And what would discovering it imply for our understanding of the Universe?

In this episode, we’ll take a look at the cosmos with Maggie Lieu. She’ll tell us what research in astrophysics is made of, what model she worked on at the European Space Agency, and how Bayesian the world of space science is.

Maggie Lieu did her PhD in the Astronomy & Space Department of the University of Birmingham. She’s now a Research Fellow of Machine Learning & Cosmology at the University of Nottingham and is working on projects in preparation for Euclid, a space-based telescope whose goal is to map the dark Universe and help us learn about the nature of dark matter and dark energy.

In a nutshell, she tries to help us better understand the entire cosmos. Even more amazing, she uses the Stan library and applies Bayesian statistical methods to decipher her astronomical data! But Maggie is not just a Bayesian astrophysicist: she also loves photography and rock-climbing!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Maggie's Website: https://maggielieu.com/
Maggie's Google Scholar Page: https://scholar.google.co.uk/citations?user=ilfwfuUAAAAJ&hl=en
Maggie on Twitter: https://twitter.com/Space_Mog
Maggie on GitHub: https://github.com/MaggieLieu
Maggie on YouTube: https://www.youtube.com/channel/UClO6TuRE6XLzbMBmQ_KY38A
Stan -- Statistical Modeling Platform: https://mc-stan.org/
Stan's YouTube Channel: https://www.youtube.com/channel/UCwgN5srGpBH4M-Zc2cAluOA

Jan 29, 2020 • 49min
#8 Bayesian Inference for Software Engineers, with Max Sklar
What is it like using Bayesian tools when you’re a software engineer or computer scientist? How do you apply these tools in the online ad industry? More generally, what is Bayesian thinking, philosophically? And is it really useful in everyday life? Because, well, you can’t fire up MCMC each time you need to make a quick decision under uncertainty… So how do you do that in practice, when you have at most a pen and paper?

In this episode, you’ll hear Max Sklar’s take on these questions. Max is a software engineer with a focus on machine learning and Bayesian inference. Now working at Foursquare’s innovation lab, he recently led the development of a causality model for Foursquare’s Ad Attribution product and taught a course on Bayesian Thinking at the Lviv Data Science Summer School.

Max is also an open-source enthusiast and a fellow podcaster – he’s the host of the Local Maximum podcast, where you can hear every week about the latest trends in AI, machine learning and technology from an engineering perspective.

Oh, and if you liked the movie « Her », with Joaquin Phoenix, well, you’re in for a treat at the end of this episode…

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Local Maximum podcast website: https://www.localmaxradio.com
Max on Twitter: https://twitter.com/maxsklar
Bayesian linear models: https://github.com/maxsklar/BayesPy/tree/master/LinearModels
Bayesian Dirichlet-Multinomial estimation: https://github.com/maxsklar/BayesPy/tree/master/DirichletEstimation
Bayesian Thinking for Applied Machine Learning slides: https://docs.google.com/presentation/d/1eiceuvXlsoFKoHdqjF3qXBkyht7vR0YXQPG82ady-TU/edit?usp=sharing

Jan 16, 2020 • 46min
#7 Designing a Probabilistic Programming Language & Debugging a Model, with Junpeng Lao
You can’t study psychology up until your PhD and end up doing very mathematical and computational data science at Google, right? It’s too hard of a U-turn -- some would even say it’s NUTS, just because they like bad puns… Well, think again, because Junpeng Lao did just that!

Before doing data science at Google, Junpeng was a cognitive psychology researcher at the University of Fribourg, Switzerland. Working in Python, Matlab and occasionally in R, Junpeng is a prolific open-source contributor, particularly to the popular TensorFlow and PyMC3 libraries. He also maintains the PyMC Discourse in his free time, where he amazingly answers all kinds of very specific questions!

In this episode, he’ll tell you what the core characteristics of TensorFlow Probability are, and when you would use TFP instead of another probabilistic programming framework, like Stan or PyMC3. He’ll also explain why PyMC4 will be based on TensorFlow Probability itself, and what future contributions he has in mind for these two amazing libraries. Finally, Junpeng will share with you his workflow for debugging a model, or just for better understanding your models.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Junpeng's blog: https://junpenglao.xyz/
Junpeng on Twitter: https://twitter.com/junpenglao
Junpeng on GitHub: https://github.com/junpenglao
Advanced Bayesian Modeling Tutorial: https://discourse.pymc.io/t/advance-bayesian-modelling-with-pymc3/1439
Stan Devs' Prior Choice Recommendations: https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations
PyMC Discourse: https://discourse.pymc.io/
PyMC3 - Probabilistic Programming in Python: https://docs.pymc.io/
TensorFlow Probability: https://www.tensorflow.org/probability/

Jan 3, 2020 • 1h 4min
#6 A principled Bayesian workflow, with Michael Betancourt
The podcast discusses a principled Bayesian workflow with Michael Betancourt, highlighting the challenges of building models and the importance of questioning default settings. Michael shares insights on Bayesian vs. frequentist methods in inference, mastering the Bayesian workflow, diverse projects in the Stan team, and personal endeavors. The episode also covers custom model building, upcoming courses on advanced topics, and resources for Bayesian methods.

Dec 17, 2019 • 47min
#5 How to use Bayes in the biomedical industry, with Eric Ma
I have two questions for you: Are you a self-learner? Then how do you stay up to date? What should you focus on if you’re a beginner, or if you’re more advanced?

And here is my second question: Are you working in biomedicine? And if you are, are you using Bayesian tools? Then how do you get your co-workers more used to posterior distributions than p-values? In other words, how do you change behaviors in a large organization?

In this episode, Eric Ma will answer all these questions and even tell us his favorite modeling techniques, which problems he encountered with these models, and how he solved them. He’ll also share with us the software-engineering workflow he uses at Novartis to share his work with colleagues.

Eric is a data scientist at the Novartis Institutes for Biomedical Research, where he focuses on Bayesian statistical methods to make medicines for patients. Eric is also a prolific open source developer: he led the development of pyjanitor, an API for cleaning data in Python, and nxviz, a visualization package for NetworkX. He also contributes to PyMC3, matplotlib and bokeh.

This is « Learning Bayesian Statistics », episode 5, recorded October 21, 2019.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Eric's website: https://ericmjl.github.io/
Eric on Twitter: https://twitter.com/ericmjl
Bayesian analysis recipes: https://github.com/ericmjl/bayesian-analysis-recipes
Bayesian deep learning demystified: https://github.com/ericmjl/bayesian-deep-learning-demystified
Causality repo: https://github.com/ericmjl/causality
Pyjanitor - Convenient data cleaning routines for repetitive tasks: https://pyjanitor.readthedocs.io/
PyMC3 - Probabilistic Programming in Python: https://docs.pymc.io/
Panel - A high-level app and dashboarding solution for Python: https://panel.pyviz.org/
Nxviz - Visualization Package for NetworkX: https://nxviz.readthedocs.io/en/latest/

Dec 4, 2019 • 49min
#4 Dirichlet Processes and Neurodegenerative Diseases, with Karin Knudson
What do neurodegenerative diseases, gerrymandering and ecological inference all have in common? Well, they can all be studied with Bayesian methods -- and that’s exactly what Karin Knudson is doing.

In this episode, Karin will share with us the vital and essential work she does to understand aspects of neurodegenerative diseases. She’ll also tell us more about computational neuroscience and Dirichlet processes -- what they are, what they do, and when you should use them.

Karin did her doctorate in mathematics, with a focus on compressive sensing and computational neuroscience at the University of Texas at Austin. Her doctoral work included applying hierarchical Dirichlet processes in the setting of neural data and focused on one-bit compressive sensing and spike-sorting.

Formerly the chair of the math and computer science department of Phillips Academy Andover, she started a postdoc at Mass General Hospital and Harvard Medical in Fall 2019. Most importantly, rock climbing and hiking have no secrets for her!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show, personally curated by Karin Knudson:
Karin on Twitter: https://twitter.com/karinknudson
Spike train entropy-rate estimation using hierarchical Dirichlet process priors (Knudson and Pillow): https://pillowlab.princeton.edu/pubs/abs_Knudson_HDPentropy_NIPS13.html
Fighting Gerrymandering with PyMC3, PyCon 2018, Colin Carroll and Karin Knudson: https://www.youtube.com/watch?v=G9I5ZnkWR0A
Expository resources on Dirichlet Processes: Chapter 23 of Bayesian Data Analysis (Gelman et al.) and http://www.gatsby.ucl.ac.uk/~ywteh/research/npbayes/dp.pdf
Hierarchical Dirichlet Processes (introduced the HDP and included applications in topic modeling and for working with time-series data and Hidden Markov Models): https://www.stat.berkeley.edu/~aldous/206-Exch/Papers/hierarchical_dirichlet.pdf
A Sticky HDP-HMM with applications to speaker diarization (a nice example of how the HDP can be used with HMM, in this case cleverly adapted so that states have more persistence): https://arxiv.org/abs/0905.2592
If you want to get deeper into the weeds and also get a sense of the history: Dirichlet Processes with Applications to Bayesian Nonparametric Problems (https://projecteuclid.org/euclid.aos/1176342871) and A Bayesian Analysis of Some Nonparametric Problems (https://projecteuclid.org/euclid.aos/1176342360)

Nov 18, 2019 • 32min
#3.2 How to use Bayes in industry, with Colin Carroll
Colin Carroll discusses implementing Bayesian tools in finance and the airline industry, emphasizing effective communication to non-technical stakeholders. He explores challenges in model fitting with golfers' accuracy data, importance of pre-processing features, and practical applications of Bayesian methods. The future of probabilistic programming frameworks is also discussed, along with admiration for Professor Gilbert Strang.

Nov 5, 2019 • 33min
#3.1 What is Probabilistic Programming & Why use it, with Colin Carroll
Colin Carroll, a machine learning researcher and key contributor to PyMC3 and ArviZ, discusses the intricacies of probabilistic programming. He explains its value in the realm of Bayesian statistics and provides insights on selecting between various libraries like Stan and Pyro based on project requirements. Colin shares his journey from pure mathematics to data science and emphasizes the importance of quantifying uncertainty for better decision-making, particularly in high-stakes scenarios like flight insurance.

Oct 23, 2019 • 44min
#2 When should you use Bayesian tools, and Bayes in sports analytics, with Chris Fonnesbeck
Chris Fonnesbeck, senior quantitative analyst for the New York Yankees and associate professor at Vanderbilt, dives deep into the world of Bayesian methods. He illustrates when to effectively employ these techniques and the challenges of teaching them. Fonnesbeck highlights their application in sports analytics, particularly in baseball, alongside marine biology findings. He discusses the importance of skills like programming and understanding priors, while also addressing issues like missing data, showcasing the growing relevance of Bayesian methods across disciplines.