Learning Bayesian Statistics

Alexandre Andorra

Are you a researcher or data scientist / analyst / ninja? Do you want to learn Bayesian inference, stay up to date or simply want to understand what Bayesian inference is?

Then this podcast is for you! You'll hear from researchers and practitioners of all fields about how they use Bayesian statistics, and how in turn YOU can apply these methods in your modeling workflow.

When I started learning Bayesian methods, I really wished there were a podcast out there that could introduce me to the methods, the projects and the people who make all that possible.

So I created "Learning Bayesian Statistics", where you'll get to hear how Bayesian statistics are used to detect dark matter in outer space, forecast elections or understand how diseases spread and can ultimately be stopped.

But this show is not only about successes -- it's also about failures, because that's how we learn best. So you'll often hear the guests talking about what *didn't* work in their projects, why, and how they overcame these challenges. Because, in the end, we're all lifelong learners!

My name is Alex Andorra by the way, and I live in Estonia. By day, I'm a data scientist and modeler at the https://www.pymc-labs.io/ (PyMC Labs) consultancy. By night, I don't (yet) fight crime, but I'm an open-source enthusiast and core contributor to the Python packages https://docs.pymc.io/ (PyMC) and https://arviz-devs.github.io/arviz/ (ArviZ). I also love https://www.pollsposition.com/ (election forecasting) and, most importantly, Nutella. But I don't like talking about it – I prefer eating it.

So, whether you want to learn Bayesian statistics or hear about the latest libraries, books and applications, this podcast is for you -- just subscribe! You can also support the show and https://www.patreon.com/learnbayesstats (unlock exclusive Bayesian swag on Patreon)!

Episodes

Jul 3, 2020 • 1h

#19 Turing, Julia and Bayes in Economics, with Cameron Pfiffer

Do you know Turing? Of course you do! With Soss and Gen, it’s one of the blockbusters to do probabilistic programming in Julia. And in this episode Cameron Pfiffer will tell us all about it -- how it came to life, how it fits into the probabilistic programming landscape, and what its main strengths and weaknesses are.

Cameron did some Rust, some Python, but he especially loves coding in Julia. That’s also why he’s one of the core-developers of Turing.jl. He’s also a PhD student in finance at the University of Oregon and did his master’s in finance at the University of Reading. His interests are pretty broad, from cryptocurrencies, algorithmic and high-frequency trading, to AI in financial markets and anomaly detection – in a nutshell he’s a fan of topics where technology is involved.

As he’s the first economist to come to the show, I also asked him how Bayesian the field of economics is, why he thinks economics is quite unique among the social sciences, and how economists think about causality -- I later learned that this topic is pretty controversial!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Bayesian Econometrics on Cameron's Blog: http://cameron.pfiffer.org/2020/03/24/bayesian-econometrics/
Cameron on Twitter: https://twitter.com/cameron_pfiffer
Cameron on GitHub: https://github.com/cpfiffer
Turing.jl -- Bayesian inference in Julia: https://turing.ml/dev/
Gen.jl -- Programmable inference embedded in Julia: https://www.gen.dev/
Soss.jl -- Probabilistic programming via source rewriting: https://github.com/cscherrer/Soss.jl
The Julia Language -- A fresh approach to technical computing: https://julialang.org/
What is Probabilistic Programming -- Cornell University: http://adriansampson.net/doc/ppl.html
Mostly Harmless Econometrics Book: http://www.mostlyharmlesseconometrics.com/

Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Brian Huey, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, Adam Bartonicek, William Benton, Alan O'Donnell, Mark Ormsby, Demetri Pananos, James Ahloy, Jon Berezowski, Robin Taylor, Thomas Wiecki, Chad Scherrer, Vincent Arel-Bundock, Nathaniel Neitzke, Zwelithini Tunyiswa, Elea McDonnell Feit, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Joshua Duncan, Ian Moran and Paul Oreto.

Jun 26, 2020 • 8min

#SpecialAnnouncement: Patreon Launched!

I hope you’re all safe! Some of you asked me if I had set up a Patreon so that you could help support the show, and that’s why I’m sending this short special episode your way today. I had thought about that, but I wasn’t sure there was a demand for this. Apparently, there is one -- at least a small one -- so, first, I wanna thank you and say how grateful I am to be in a community that values this kind of work!

The Patreon page is now live at patreon.com/learnbayesstats. It starts as low as 3€ and you can pick from 4 different tiers:

"Maximum A Posteriori" (3€): Join the Slack, where you can ask questions about the show, discuss with like-minded Bayesians and meet them in person when you travel the world.
"Full Posterior" (5€): Previous tier + your name in all the show notes, and I'll express my gratitude to you in the first episode to go out after your contribution. You also get early access to the special episodes, which I'll make at an irregular pace and which will include panel discussions, book releases, live shows, etc.
"Principled Bayesian" (20€): Previous tiers + every 2 months, I'll ask my guest two questions voted on by "Principled Bayesians". I'll probably do that with a poll in the Slack channel, which only "Principled Bayesians" will be able to answer, and I'll ask the top 2 questions on the show every two months.
"Good Bayesian" (200€, only 8 spots): Previous tiers + every 2 months, you can come on the show and ask the guest one question, without a vote. That's why I can't have too many people in that tier.

Before telling you the best part: I already have a lot of ideas for exclusive content and options. I first need to see whether you're as excited as I am about it. If I see you are, I'll be able to add new perks to the tiers! So give me your feedback about the current tiers or any benefits you'd like to see there... but don't see yet! BTW, you have a new way to do that now: sending me voice messages at anchor.fm/learn-bayes-stats/message!

Now, the icing on the cake: until July 31st, if you choose the "Full Posterior" tier (5€) or higher, you get early access to the very special episode I'm planning with Andrew Gelman, Jennifer Hill and Aki Vehtari about their upcoming book, "Regression and Other Stories". To top it off, there will be a promo code in the episode to buy the book at a discounted price -- now, that is an offer you can't turn down!

Alright, that is it for today -- I hope you’re as excited as I am for this new stage in the podcast’s life! Please keep the emails, the tweets, the voice messages, the carrier pigeons coming with your feedback, questions and suggestions.

In the meantime, take care and I’ll see you in the next episode -- episode 19, with Cameron Pfiffer, who’s the first economist to come on the show and who’s a core-developer of Turing.jl. We’re gonna talk about the Julia probabilistic programming landscape, Bayes in economics and causality -- it’s gonna be fun ;) Again, patreon.com/learnbayesstats if you want to support the show and unlock some nice perks. Thanks again, I am very grateful for any support you can bring me!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
LBS Patreon page: patreon.com/learnbayesstats
Send me voice messages: https://anchor.fm/learn-bayes-stats/message

Jun 18, 2020 • 58min

#18 How to ask good Research Questions and encourage Open Science, with Daniel Lakens

Daniel Lakens, an experimental psychologist at Eindhoven University of Technology, dives into the art of crafting effective research questions and experimental designs. He sheds light on the importance of open science and how it can reshape funding and publishing practices. The discussion also tackles the ongoing reproducibility crisis in psychology and the value of acknowledging flawed research. Lakens champions transparency and collaboration, advocating for better statistical education to enhance the credibility of scientific findings.

Jun 4, 2020 • 52min

#17 Reparametrize Your Models Automatically, with Maria Gorinova

Have you already encountered a model that you know is scientifically sound, but that MCMC just wouldn’t run? The model would take forever to run -- if it ever ran -- and you would be greeted with a lot of divergences in the end. Yeah, I know, my stress levels start rising too whenever I hear the word « divergences »…

Well, you’ll be glad to hear there are tricks to make these models run, and one of these tricks is called re-parametrization -- I bet you’ve already heard about the poorly-named non-centered parametrization?

Well, fear no more! In this episode, Maria Gorinova will tell you all about these model re-parametrizations! Maria is a PhD student in Data Science & AI at the University of Edinburgh. Her broad interests range from programming languages and verification, to machine learning and human-computer interaction. More specifically, Maria is interested in probabilistic programming languages, and in exploring ways of applying program-analysis techniques to existing PPLs in order to improve usability of the language or efficiency of inference.

As you’ll hear in the episode, she thinks a lot about the language aspect of probabilistic programming, and works on the automation of various “tricks” in probabilistic programming: automatic re-parametrization, automatic marginalization, automatic and efficient model-specific inference.

As Maria also has experience with several PPLs like Stan, Edward2 and TensorFlow Probability, she’ll tell us what she thinks a good PPL design requires, and what the future of PPLs looks like to her.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Maria on the Web: http://homepages.inf.ed.ac.uk/s1207807/index.html
Maria on Twitter: https://twitter.com/migorinova
Maria on GitHub: https://github.com/mgorinova
Automatic Reparameterisation of Probabilistic Programs (Maria's paper with Dave Moore and Matthew Hoffman): https://arxiv.org/abs/1906.03028
Stan User's Guide on Reparameterization: https://mc-stan.org/docs/2_23/stan-users-guide/reparameterization-section.html
HMC for hierarchical models -- Background on reparameterization: https://arxiv.org/abs/1312.0906
NeuTra -- Automatic reparameterization: https://arxiv.org/abs/1903.03704
Edward2 -- A library for probabilistic modeling, inference, and criticism: http://edwardlib.org/
Pyro -- Automatic reparameterization and marginalization: https://pyro.ai/
Gen -- Programmable inference: http://probcomp.csail.mit.edu/software/gen/
TensorFlow Probability: https://www.tensorflow.org/probability/

May 21, 2020 • 1h 8min

#16 Bayesian Statistics the Fun Way, with Will Kurt

Will Kurt, lead Data Scientist at Hopper, shares insights on Bayesian statistics, his journey from a Boston librarian to a data scientist, and the value of Bayesian inference. He discusses the mind projection fallacy, logistic regression, upcoming plans, and promoting critical thinking in society.

May 6, 2020 • 1h 6min

#15 The role of Python in Science and Education, with Michael Kennedy

This is it folks! This is the first of the special episodes I want to do from time to time, to expand our perspective and get inspired by what’s going on elsewhere. The guests will not come directly from the Bayesian world, but will still be related to science or programming.

For the first episode of this kind, I had the chance to chat with Michael Kennedy! Michael is not only a very knowledgeable and respected member of the Python community, he’s also the founder and host of Talk Python To Me, the most popular Python podcast. He’s the founder and chief author at Talk Python Training, where he develops many online courses for Python developers. And before that, Michael was a professional software trainer for over 10 years – he has taught numerous developers throughout the world! But Michael is not only an entrepreneur and teacher – he’s also a father, a husband, and a proud inhabitant of Portland, OR!

As you’ll hear, our conversation spanned a large array of topics: the role of Python in science and research; how it came to be so important in data science, and why; what Python’s threats and weaknesses are, and how it should evolve to not become obsolete. Michael also has interesting thoughts on the role of programming in education and how it relates to geometry -- but I’ll let you discover that one by yourself…

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Michael on Twitter: https://twitter.com/mkennedy
The Talk Python Podcast: https://talkpython.fm/
The Python Bytes Podcast: https://pythonbytes.fm/
Michael's blog: https://blog.michaelckennedy.net/
Michael on Crowdcast: https://www.crowdcast.io/mkennedy
Jupytext -- Turn Jupyter Notebooks to scripts and (R) Markdown files: https://jupytext.readthedocs.io/en/latest/introduction.html

Apr 22, 2020 • 49min

#14 Hidden Markov Models & Statistical Ecology, with Vianey Leos-Barajas

I bet you love penguins, right? The same goes for koalas, or puppies! But what about sharks? Well, my next guest loves sharks -- she loves them so much that she works a lot with marine biologists, even though she’s a statistician! Vianey Leos Barajas is indeed a statistician primarily working in the areas of statistical ecology, time series modeling, Bayesian inference and spatial modeling of environmental data. Vianey did her PhD in statistics at Iowa State University and is now a postdoctoral researcher at North Carolina State University.

In this episode, she’ll tell us what she’s working on that involves sharks, sheep and other animals! Trying to model animal movements, Vianey often encounters the dreaded multimodal posteriors. She’ll explain why these can be very tricky to estimate, and why ecological data are particularly suited for hidden Markov models and spatio-temporal models -- don’t worry, Vianey will explain what these models are in the episode!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Vianey on Twitter: https://twitter.com/vianey_lb
Hidden Markov Models in the Stan User's Guide: https://mc-stan.org/docs/2_18/stan-users-guide/hmms-section.html
Tagging Basketball Events with HMM in Stan: https://mc-stan.org/users/documentation/case-studies/bball-hmm.html
HMMs with Python and PyMC3: https://ericmjl.github.io/bayesian-analysis-recipes/notebooks/markov-models/
The Discrete Adjoint Method -- Efficient Derivatives for Functions of Discrete Sequences (Betancourt, Margossian, Leos-Barajas): https://arxiv.org/abs/2002.00326
Vianey will be doing an HMM 90-minute introduction at the International Statistical Ecology Conference in June 2020: http://www.isec2020.org/
Stan for Ecology -- a website for the ecology community in Stan: https://stanecology.github.io/
LatinR 2020 -- 7th to 9th October 2020: https://latin-r.com/
Migramar -- Science for the Conservation of Marine Migratory Species in the Eastern Pacific: http://migramar.org/hi/en/
Pelagios Kakunja -- Know, educate and conserve for a sustainable sea: https://www.pelagioskakunja.org/

Book recommendations:
Hidden Markov Models for Time Series: https://www.routledge.com/Hidden-Markov-Models-for-Time-Series-An-Introduction-Using-R-Second-Edition/Zucchini-MacDonald-Langrock/p/book/9781482253832
Handbook of Mixture Analysis: https://www.routledge.com/Handbook-of-Mixture-Analysis-1st-Edition/Fruhwirth-Schnatter-Celeux-Robert/p/book/9781498763813
Pattern Recognition and Machine Learning: http://users.isr.ist.utl.pt/~wurmd/Livros/school/Bishop%20-%20Pattern%20Recognition%20And%20Machine%20Learning%20-%20Springer%20%202006.pdf

Apr 8, 2020 • 44min

#13 Building a Probabilistic Programming Framework in Julia, with Chad Scherrer

How is Julia doing? I’m talking about the programming language, of course! What does the probabilistic programming landscape in Julia look like? What are Julia’s distinctive features, and when would it be interesting to use it?

To talk about that, I invited Chad Scherrer. Chad is a Senior Research Scientist at RelationalAI, a company that uses Artificial Intelligence technologies to solve business problems.

Coming from a mathematics background, Chad did his PhD at Indiana University Bloomington and has been working in statistics and data science for a decade now. Through this experience, he’s been using and developing probabilistic programming languages – so he’s familiar with Python, R, PyMC, Stan and all the blockbusters of the field. But since 2018, he’s been particularly interested in Julia and has developed Soss, an open-source lightweight probabilistic programming package for Julia.

In this episode, he’ll tell us why he decided to create this package, and which choices made Soss what it is today. But we’ll also talk about other projects in Julia, like Turing or Gen for instance.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Chad's Website: https://cscherrer.github.io/
Chad on Twitter: https://twitter.com/ChadScherrer
Soss Package: https://github.com/cscherrer/Soss.jl
Soss Presentation at 2019 Strata NYC: https://slides.com/cscherrer/2019-09-26-strata#/
Passage -- A Parallel Sampler Generator for Hierarchical Bayesian Modeling: https://bit.ly/2UTmaYB
Dynamic HMC in Julia: https://github.com/tpapp/DynamicHMC.jl
Advanced HMC in Julia: https://github.com/TuringLang/AdvancedHMC.jl
Monte Carlo Measurements in Julia: https://github.com/baggepinnen/MonteCarloMeasurements.jl
Turing.jl -- Bayesian inference with probabilistic programming: https://turing.ml/dev/
Gen.jl -- Probabilistic modeling and inference in Julia: https://www.gen.dev/
Etalumis -- Bringing Probabilistic Programming to Scientific Simulators at Scale: https://arxiv.org/abs/1907.03382
Omega.jl -- A programming language for causal and probabilistic reasoning: http://www.zenna.org/Omega.jl/latest/
JuliaLang -- The Ingredients for a Composable Programming Language: https://white.ucc.asn.au/2020/02/09/whycompositionaljulia.html
Simpy -- Discrete event simulation for Python: https://simpy.readthedocs.io/en/latest/

Mar 25, 2020 • 47min

#12 Biostatistics and Differential Equations, with Demetri Pananos

Do you know Google Summer of Code? It’s a time of year when students can contribute to open-source software by developing and adding much-needed functionality to the open-source package of their choice. And Demetri Pananos did just that.

He did it in 2019 with PyMC3, for which he developed the API for ordinary differential equations. In this episode, he’ll tell us why and how he did that, what he learned from the experience, and what the strengths and weaknesses of the API are in his opinion.

Demetri is a Ph.D. candidate in Biostatistics at Western University, in Ontario, Canada. His research interests surround machine learning and Bayesian statistics for personalized medicine. He earned his Master’s in Applied Mathematics from The University of Waterloo and is a firm believer in open science, interdisciplinary collaboration, and reproducible research. Other than that, he loves plotting data and drinking IPA beer – well, who doesn’t?

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Demetri on Twitter: https://twitter.com/PhDemetri
Demetri on GitHub: https://github.com/Dpananos
Demetri's website: https://dpananos.github.io/
PyMC3, Probabilistic Programming in Python: https://docs.pymc.io/
Chris Bishop, Pattern Recognition and Machine Learning: https://www.amazon.fr/Pattern-Recognition-Machine-Learning-Christopher/dp/0387310738
Bayesian Data Analysis (Gelman, Carlin, Stern, Dunson, Vehtari, Rubin): http://www.stat.columbia.edu/~gelman/book/
Parallel Plots: https://arviz-devs.github.io/arviz/generated/arviz.plot_parallel.html

Mar 11, 2020 • 58min

#11 Taking care of your Hierarchical Models, with Thomas Wiecki

I bet you’ve already heard about hierarchical models, or multilevel models, or varying-effects models -- yeah, this type of model has a lot of names! Many people even turn to Bayesian tools to build _exactly_ these models. But what are they? How do you build and use a hierarchical model? What are the tricks and classical traps? And even more important: how do you _interpret_ a hierarchical model?

In this episode, Thomas Wiecki will come to the rescue and explain what multilevel models are, how to build them, what their powers are… but also why you should be very careful when building them…

Does the name Thomas Wiecki ring a bell? Probably because he’s the host and creator of the PyData Deep Dive Podcast, where he interviews open-source contributors from the Python and Data Science worlds! Thomas is also the VP of Data Science at Quantopian, a crowd-sourced quantitative investment firm that encourages people everywhere to write investment algorithms.

Finally, Thomas is a longtime Bayesian and core-developer of PyMC3, a fantastic Python package for probabilistic programming. On his blog, he publishes tutorial articles and explores new ideas such as Bayesian Deep Learning. Caring a lot about open-source software sustainability, he puts all he’s up to on his Patreon page, which you’ll find in the show notes.

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Links from the show:
Thomas’ series on Hierarchical Regression: https://twiecki.io/blog/2013/08/12/bayesian-glms-1/
Non-centered Parametrization with PyMC3: https://twiecki.io/blog/2017/02/08/bayesian-hierchical-non-centered/
Using Bayesian Decision Making: https://twiecki.io/blog/2019/01/14/supply_chain/
PyMC3 - Probabilistic Programming in Python: https://docs.pymc.io/
Symbolic PyMC: https://pymc-devs.github.io/symbolic-pymc/
PyData Deep Dive Podcast: https://pydata-podcast.com
Thomas on Twitter: https://twitter.com/twiecki?lang=en
Thomas on Patreon: https://www.patreon.com/twiecki
Thomas on GitHub: https://github.com/twiecki
Alex’s Hierarchical Model of Elections in Paris: https://mybinder.org/v2/gh/AlexAndorra/pollsposition_models/master?urlpath=%2Fvoila%2Frender%2Fdistrict-level%2Fmunic_model_analysis.ipynb
