Normal Curves: Sexy Science, Serious Statistics

Regina Nuzzo and Kristin Sainani
Jan 12, 2026 • 1h 14min

Bonus: Sugar Sag with Commentary

While we’re on a short break between seasons, we’re revisiting some of our favorite episodes from Season 1. This week, we’re re-releasing our exploration of how your diet can affect your skin, now with added commentary!

Wrinkles and sagging skin: just normal aging, or can you blame your sweet tooth? We dive into “sugar sag,” exploring how sugar, processed foods, and even your crispy breakfast toast might be making you look older than if you’d said no to chocolate cake and yes to broccoli. Along the way, we encounter statistical adjustment, training and test data sets, what we call “references to nowhere,” plus some cadavers and collagen. Ever heard of an AGE reader? Find out how this tool might offer a sneak peek at your date’s age, and maybe even a clue about his… um… “performance.”

Statistical topics
Confounding
Correlation vs causation
Measurement error / proxy variables
Overfitting
Plagiarism
Proper citing practices
References to nowhere
Statistical adjustment
Training and test sets

Methodologic morals
“When you plagiarize, you steal the errors too.”
“Overdone statistical adjustment is like overdone photo filters–at a certain point it’s just laughable.”

Citations
Collagen turnover: Verzijl N, DeGroot J, Thorpe SR, et al. Effect of Collagen Turnover on the Accumulation of Advanced Glycation End Products. JBC. 2000;275:39027-31.
Cadaver study: Hamlin CR, Kohn RR, Luschin JH. Apparent Accelerated Aging of Human Collagen in Diabetes Mellitus. Diabetes. 1975; 24: 902–904.
AGE Reader
Studies of AGEs and diabetes and health:
Monnier VM, Cerami A. Nonenzymatic browning in vivo: possible process for aging of long-lived proteins. Science. 1981;211:491-3.
Brownlee M, Vlassara H, Cerami A. Nonenzymatic glycosylation and the pathogenesis of diabetic complications. Ann Intern Med. 1984;101:527-37.
Monnier VM, Vishwanath V, Frank KE, et al. Relation between Complications of Type I Diabetes Mellitus and Collagen-Linked Fluorescence. N Engl J Med. 1986;314:403-408.
Monnier VM, Sell DR, Abdul-Karim FW, et al. Collagen browning and cross-linking are increased in chronic experimental hyperglycemia. Relevance to diabetes and aging. Diabetes. 1988;37:867-72.
Monnier VM, Bautista O, Kenny D, et al. Skin collagen glycation, glycoxidation, and crosslinking are lower in subjects with long-term intensive versus conventional therapy of type 1 diabetes: relevance of glycated collagen products versus HbA1c as markers of diabetic complications. Diabetes 1999; 48: 870–80.
Genuth S, Sun W, Cleary P, et al. Glycation and carboxymethyllysine levels in skin collagen predict the risk of future 10-year progression of diabetic retinopathy and nephropathy in the diabetes control and complications trial and epidemiology of diabetes interventions and complications participants with type 1 diabetes. Diabetes. 2005;54:3103-11.
van Waateringe RP, Slagter SN, van Beek AP, et al. Skin autofluorescence, a non-invasive biomarker for advanced glycation end products, is associated with the metabolic syndrome and its individual components. Diabetol Metab Syndr. 2017;9:42.
Kouidrat Y, Zaitouni A, Amad A, et al. Skin autofluorescence (a marker for advanced glycation end products) and erectile dysfunction in diabetes. J Diabetes Complications. 2017;3:108-113.
Fujita N, Ishida M, Iwane T, et al. Association between Advanced Glycation End-Products, Carotenoids, and Severe Erectile Dysfunction. World J Mens Health. 2023;41:701-11.
Uruska A, Gandecka A, Araszkiewicz A, et al. Accumulation of advanced glycation end products in the skin is accelerated in relation to insulin resistance in people with Type 1 diabetes mellitus. Diabet Med. 2019;36:620-625.
Boersma HE, Smit AJ, Paterson AD, et al. Skin autofluorescence and cause-specific mortality in a population-based cohort. Sci Rep 2024;14:19967.
Review article with conflicts of interest: Draelos ZD. Sugar Sag: What Is Skin Glycation and How Do You Combat It? J Drugs Dermatol. 2024; 23:s5-10.
Clinical study on AGE interrupter cream: Draelos ZD, Yatskayer M, Raab S, Oresajo C. An evaluation of the effect of a topical product containing C-xyloside and blueberry extract on the appearance of type II diabetic skin. J Cosmet Dermatol. 2009;8:147-51.

Our citation trail:
2023 review article: Zgutka K, Tkacz M, Tomasiak, et al. A Role for Advanced Glycation End Products in Molecular Ageing. Int J Mol Sci. 2023; 24: 9881.
Sentence: “Interestingly, strict control of blood sugar for 4 months reduced the production of glycosylated collagen by 25%, and low-sugar food prepared by boiling could also reduce the production of AGEs [152].”
Reference 152 is a review article: Cao C, Xiao Z, Wu Y, et al. Diet and Skin Aging-From the Perspective of Food Nutrition. Nutrients. 2020;12:870.
Sentence: “However, strict control of blood sugar for four months can reduce the production of glycosylated collagen by 25%, and low-sugar food prepared by boiling can also reduce the production of AGEs [93–95].”
Reference 93 is a review article: Nguyen HP, Katta R. Sugar sag: Glycation and the role of diet in aging skin. Skin Ther Lett. 2015; 20: 1–5.
Sentence: “Tight glycemic control over a 4-month period can result in a reduction of glycated collagen formation by 25%.37,38”
References 94 and 38 are the same review article: Draelos ZD. Aging skin: the role of diet: facts and controversies. Clin Dermatol. 2013;31:701-6.
Sentence: “Tighter glycemic control can reduce glycated collagen by 25% in 4 months.” No citation given....
Dec 29, 2025 • 1h 29min

Bonus: Vitamin D Part 1 with commentary

Dive into the vitamin D conundrum as the hosts unravel the claims of a deficiency epidemic. Discover the surprising surfer study from Hawaii that challenges assumptions about sun exposure and vitamin D levels. Explore the clash between differing medical guidelines and how arbitrary thresholds can warp our understanding of health. With a touch of statistical sleuthing, they highlight biases, conflicts of interest, and the nuances behind dietary recommendations. Get ready for a captivating discussion that questions the very data behind vitamin D health claims.
Dec 15, 2025 • 47min

The Batman Effect: Do weird surprises make people nicer?

Nobody expects Batman, but when he shows up in a crowded subway car, are people suddenly more likely to help a passenger in need? This week on Normal Curves, we unpack a recent quasi-experimental field study involving a caped superhero costume, a prosthetic pregnancy belly, and some puzzled Italian commuters. Along the way, we demystify three common ways of describing effects for binary outcomes (risk differences, risk ratios, and odds ratios) and explain what they actually mean in plain language. We also do some statistical sleuthing, uncover a major problem hiding in the paper’s numbers, and debate what really counts as an effective Batman outfit.

Statistical topics
absolute vs relative effects
binary outcomes
coding errors
data errors and quality control
effect size interpretation
field experiments
odds
odds ratios
percentage differences
quasi-experimental studies
risk differences
risk ratios
statistical sleuthing

Methodological morals
“We love an uncluttered paper, but when it's missing the basics, it's like an empty fridge. Clean, yes, but dinner is not happening.”
“Before you make a fancy model, make sure the numbers in the table and the text match.”

References
Pagnini F, Grosso F, Cavalera C, et al. Unexpected events and prosocial behavior: the Batman effect. Npj Ment Health Res. 2025;4(1):57. Published 2025 Nov 3. doi:10.1038/s44184-025-00171-5
PubPeer. Comments on “Unexpected events and prosocial behavior: the Batman effect.” Accessed December 2025.
Sainani KL. Understanding odds ratios. PM R. 2011;3(3):263-267. doi:10.1016/j.pmrj.2011.01.009
Nuzzo RL. Communicating measures of relative risk in plain English. PM R. 2022;14(2):283-287. doi:10.1002/pmrj.12761
Sainani KL. How statistics can mislead. Am J Public Health. 2012;102:e3-4.

Kristin and Regina’s online courses:
Demystifying Data: A Modern Approach to Statistical Understanding
Clinical Trials: Design, Strategy, and Analysis
Medical Statistics Certificate Program
Writing in the Sciences
Epidemiology and Clinical Research Graduate Certificate Program

Programs that we teach in:
Epidemiology and Clinical Research Graduate Certificate Program

Find us on:
Kristin - LinkedIn & Twitter/X
Regina - LinkedIn & ReginaNuzzo.com

(00:00) - Intro
(03:42) - Why would Batman make people nicer?
(07:33) - How they ran the experiment
(17:50) - Did Batman save the day? Different ways to answer that
(23:00) - What are odds and odds ratios?
(30:00) - Where people get it wrong
(34:52) - The plot twist: big numerical errors
(41:20) - Did men or women give up their seat more often?
(43:49) - Wrap-up and methodological morals
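
For listeners who want to try the arithmetic themselves, here is a minimal Python sketch of the three effect measures named in the Batman episode notes: risk difference, risk ratio, and odds ratio. The function name `effect_measures` and the counts are our own illustration and are hypothetical; they are not the numbers from the Batman paper.

```python
def effect_measures(events_a, n_a, events_b, n_b):
    """Compare a binary outcome in group A versus group B.

    events_* = number of people with the outcome, n_* = group size.
    Assumes both groups have at least one event and one non-event,
    otherwise the ratios are undefined.
    """
    risk_a = events_a / n_a
    risk_b = events_b / n_b
    # Odds = people with the outcome divided by people without it
    odds_a = events_a / (n_a - events_a)
    odds_b = events_b / (n_b - events_b)
    return {
        "risk_a": risk_a,
        "risk_b": risk_b,
        "risk_difference": risk_a - risk_b,  # absolute effect
        "risk_ratio": risk_a / risk_b,       # relative effect
        "odds_ratio": odds_a / odds_b,       # relative effect on the odds scale
    }


# Hypothetical counts for illustration only (not from any study)
example = effect_measures(events_a=60, n_a=100, events_b=40, n_b=100)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts the odds ratio comes out noticeably larger than the risk ratio, which is the usual pattern when the outcome is common and one reason the two should not be read interchangeably.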
Dec 1, 2025 • 1h 4min

Holiday Survival Guide Part 2: The survey study edition

Does the temperature of your coffee six months ago really predict whether you feel gassy today? This week we dissect a new nutrition survey study on hot and cold beverage habits that claims to connect drink temperature with gut symptoms, anxiety, and more, despite relying on year-old memories and a blizzard of statistical tests. It’s the perfect case study for our Holiday Survival Guide Part 2, where we teach you how to talk with Uncle Joe at the dinner table about one of the most common (and most fraught) study designs in science: cross-sectional surveys. We walk through our easy checklist for making sense of results, show how recall bias and measurement error can skew the story, and reacquaint you with nonmonogamous Multiple-Testing Dude, who’s been very busy in this dataset. A friendly, practical guide to spotting when researchers are just torturing the data until it confesses.

Statistical topics
Confounding
Cross-sectional studies
False positives
Measurement error
Multiple testing
PICOT / PIVOT framework
Recall bias
Research hypotheses
Sample size and power
Signal vs. noise
SMART framework
Statistical significance
Subgroup analyses
Survey design
Transparency and trustworthiness

Methodological morals
“When your measurement starts with ‘think back to last winter’ you might as well use a random number generator.”
“If the effect is only significant in certain subgroups in certain seasons for certain outcomes, it might just be a bad case of gas.”

References
Wu T, Doyle C, Ito J, et al. Cold Exposures in Relation to Dysmenorrhea among Asian and White Women. Int J Environ Res Public Health. 2023;21(1):56. Published 2023 Dec 30. doi:10.3390/ijerph21010056
Wu T, Ramesh N, Doyle C, Hsu FC. Cold and hot consumption and health outcomes among US Asian and White populations. Br J Nutr. Published online September 18, 2025. doi:10.1017/S000711452510514X

(00:00) - Intro
(04:36) - Did they have real research hypotheses?
(10:29) - Observational or randomized experiment?
(20:09) - PICOT and PIVOT
(26:20) - Memory problems
(32:03) - Five outcomes and measurement problems therein
(36:56) - SMART
(41:50) - Multiple Testing Dude is having a great time
(52:36) - How big is the effect?
(59:06) - Wrap-up and Irish Coffee rating scale
Nov 17, 2025 • 1h 2min

Holiday Survival Guide: How to talk about scientific studies around the dinner table

Does a little alcohol really make you speak a foreign language better? This week we unpack a quirky randomized trial that tested Dutch pronunciation after a modest buzz, and came to the opposite conclusion from the one the researchers expected. We use it as the perfect holiday case study: instead of arguing with Uncle Joe at the dinner table, we’ll show you how to pull apart a scientific headline using a friendly, practical checklist anyone can learn. Along the way we stress-test the study’s claims, take a quick detour into what a 0.04% buzz actually looks like, and run our own before-and-after experiment with two brave science journalists at the ScienceWriters2025 conference in Chicago. A holiday survival guide with vodka tonics, statistical sleuthing, and a few surprisingly smooth French phrases.

Statistical topics
Alternative explanations
Arithmetic consistency / GRIM test
Blinding
Effect size / magnitude
Generalizability / external validity
Observational studies vs. experiments
Outcome measurement
PICOT framework
Placebo and expectancy effects
Primary outcomes / pre-specification
Randomized controlled trials
Research hypotheses
Sample size
SMART framework
Statistical significance (signal vs. noise)
Transparency and trustworthiness

Methodological morals
“You don't need a PhD to read a study. Just remember, PICOT and SMART.”
“A decimal point can mean the difference between life and death. Details matter.”

References
Renner F, Kersbergen I, Field M, Werthmann J. Dutch courage? Effects of acute alcohol consumption on self-ratings and observer ratings of foreign language skills. J Psychopharmacol. 2018;32(1):116-122. doi:10.1177/0269881117735687

(00:00) - Intro
(03:30) - Uncle Joe and the question of alcohol
(07:20) - Randomized controlled trial
(10:10) - PICOT mnemonic
(15:43) - Just how drunk?
(22:25) - Boring non-placeb
(33:13) - Kristin’s SMART mnemonic
(39:32) - How big of an effect?
(50:46) - Two science journalists walk into a bar
(57:00) - Martini scale and wrap-up
Nov 3, 2025 • 1h 13min

Shingles Shot and Dementia: Could one vaccine protect your brain?

What do chickenpox and shingles have to do with your brain? This week, we dig into two 2025 headline-grabbing studies that link the shingles shot to lower dementia rates. We start in Wales, where a birthday cutoff turned into the perfect natural experiment, and end in the U.S. with a multi-million-person megastudy. Featuring bias-variance Goldilockses, Fozzy-the-Bear regression discontinuities, a Barbie-versus-Oppenheimer showdown for propensity scores, and the hottest rebrand of inverse-probability weighting you’ll ever hear.

Statistical topics
Absolute vs. relative risk
Bias–variance tradeoff
Causal inference
Censoring
Confounding
Fuzzy regression discontinuity design
Healthy-user bias
Inverse probability of treatment weighting (IPTW)
Longitudinal study
Natural experiment
Negative controls
Optimal bandwidth
Propensity scores
Selection bias
Subgroup analysis
Triangular kernel weights

Methodological morals
“Propensity scores are the lipstick you put on observational pigs.”
“Natural experiments are a hot flirtation date with causality.”

References
Eyting M, Xie M, Michalik F, Heß S, Chung S, Geldsetzer P. A natural experiment on the effect of herpes zoster vaccination on dementia. Nature. 2025 May;641(8062):438-446. doi: 10.1038/s41586-025-08800-x. Epub 2025 Apr 2. PMID: 40175543; PMCID: PMC12058522.
Polisky V, Littmann M, Triastcyn A, et al. Varicella-zoster virus reactivation and the risk of dementia. Nat Med. Published online October 6, 2025. doi:10.1038/s41591-025-03972-5
Sainani KL. Propensity scores: uses and limitations. PM&R 2012; 4:693-97.

(00:00) - Intro and first gratuitous mention of sex
(03:56) - What are shingles, chickenpox, and the vaccines against them?
(12:30) - Fun facts about the varicella zoster and herpes viruses
(18:00) - A natural experiment in Wales
(21:54) - What is the Goldilocks optimal bandwidth?
(26:17) - Fuzzy regression discontinuity design demystified
(32:43) - Shingles vaccine vs dementia showdown
(34:13) - Absolute risk reduction paradox
(37:44) - Effects for men and women differ
(41:07) - A giant longitudinal study
(47:51) - Propensity scores demystified via Barbie and Oppenheimer
(53:55) - Using propensity scores to make matches
(58:08) - Inverse probability of treatment weighting demystified via more Barbenheimer
(01:02:27) - Attempts to rename IPTW for TikTok
(01:05:59) - Longitudinal study results
(01:10:00) - Smooch ratings and methodological morals: pigs and hot dates
Oct 20, 2025 • 1h 5min

Scary Bridge Study: Can fear make you horny?

What if a haunted house makes your date look hotter? This week we dive into the infamous Scary Bridge Study, the 1970s classic that launched a thousand pop-psych takes on fear and lust. It’s the one with the swaying bridge, pretty “research assistant,” and phone number scrawled on torn paper. The study became legend, but how sturdy were its stats? We retrace the design, redo the numbers, and see how many math errors it takes to sway a suspension bridge. Along the way we find an erotic-fiction writing exercise, Adventure Dudes choosing their own experimental groups, and snarky replicators who tried (and failed) to make fear sexy again. We wrap with what the latest research says about when fear really does boost attraction, and when it backfires spectacularly. A Halloween story of danger, desire, and unconscious sexual drive.

This episode has a video version! https://www.youtube.com/watch?v=2coWoS_3460

Statistical topics
Arithmetic checks
Chi-square test
Confounders
GRIM test
Inter-rater reliability
Meta-analysis
Negative control
Randomization
Replication
Sample size
Signal vs. noise
Statistical sleuthing
Subjective measurement
T-test

Methodological morals
“Those who don't verify their numbers dig their own statistical graves.”
“Famous doesn't mean flawless.”

References
Brown, NJ, Heathers, JA. The GRIM test: A simple technique detects numerous anomalies in the reporting of results in psychology. Social Psychological and Personality Science. 2017; 8(4):363-369.
Dutton DG, Aron AP. Some evidence for heightened sexual attraction under conditions of high anxiety. J Pers Soc Psychol. 1974;30(4):510-517. doi:10.1037/h0037031
Foster CA, Witcher BS, Campbell WK, Green JD. Arousal and attraction: Evidence for automatic and controlled processes. J Pers Soc Psychol. 1998;74(1):86-101.
Kenrick DT, Cialdini R, Linder D. Misattribution under fear-producing circumstances: Four failures to replicate. Pers Soc Psychol Bull. 1979;5(3):329-334.
van der Zee T, Anaya J, Brown NJL. Statistical heartburn: an attempt to digest four pizza publications from the Cornell Food and Brand Lab. BMC Nutr. 2017;3:54. Published 2017 Jul 10. doi:10.1186/s40795-017-0167-x
http://www.prepubmed.org/grim_test/

(00:00) - Intro: Fear and Flirtation on a Suspension Bridge
(05:40) - A Classic 1970s Experiment with No IRB to be Found
(11:15) - Adventure Dudes Choose Their Own Bridge
(17:00) - The Sexy Story Scale
(22:20) - Cool Factor and the Negative Control
(28:54) - Grim Reaper Math
(36:29) - T-Tests, Chi-Squares, and Shaky Results
(42:44) - Electric Shocks and Damsels in Distress
(50:49) - Replications and Rejections
(58:39) - Wrap-Up, Methodological Morals, and a New Sexy Rating Scale
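
The GRIM test listed in the Scary Bridge episode's topics is simple enough to sketch in a few lines of Python. The idea, from Brown and Heathers, is that if every participant gives a whole-number score, the sum is a whole number, so only certain means are possible for a given sample size. The function name `grim_consistent` and the example values below are our own and are hypothetical; they are not taken from the Scary Bridge paper, and this is a rough sketch rather than the authors' reference implementation.

```python
import math


def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """Return True if some whole-number total of n integer scores
    could round to the reported mean at the given number of decimals.

    Assumes each individual score is an integer (e.g. a 1-5 rating).
    """
    # Half a unit in the last reported decimal place, plus a tiny
    # allowance so either rounding convention (half-up or half-even) passes
    tolerance = 0.5 * 10 ** (-decimals) + 1e-9
    nearest_total = math.floor(reported_mean * n)
    # Check the whole-number totals closest to mean * n
    for total in range(nearest_total - 1, nearest_total + 3):
        if abs(total / n - reported_mean) <= tolerance:
            return True
    return False


# Hypothetical reported values for illustration only
print(grim_consistent(5.18, n=28))  # 145 / 28 rounds to 5.18
print(grim_consistent(5.19, n=28))  # no whole-number total gives 5.19
```

The check is only informative when the sample is small relative to the reported precision: with two decimals and 100 or more participants, every mean passes.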
Oct 6, 2025 • 59min

Ultramarathons: Can vitamin D protect your bones?

Ultramarathoners push their bodies to the limit, but can a giant pre-race dose of vitamin D really keep their bones from breaking down? In this episode, we dig into a trial that tested this claim and found a statistical endurance event of its own: six highly interchangeable papers sliced from one small study. Expect missing runners, recycled figures, and a peer review that reads like stand-up comedy, plus a quick lesson in using degrees of freedom as your statistical breadcrumbs.

Statistical topics
Data cleaning and validation
Degrees of freedom
Exploratory vs confirmatory analysis
False positives and Type I error
Intention-to-treat principle
Multiple testing
Open data and transparency
P-hacking
Salami slicing
Parametric vs non-parametric tests
Peer review quality
Randomized controlled trials
Research reproducibility
Statistical sleuthing

Methodological morals
“Degrees of freedom are the breadcrumbs in statistical sleuthing. They reveal the sample size even when the authors do not.”
“Publishing the same study again and again with only the outcomes swapped is Mad Libs Science, better known as salami slicing.”

References
Boswell, Rachel. Pre-race vitamin D could do wonders for ultrarunners’ bone health, according to science. Runner’s World. September 25, 2025.
Mieszkowski J, Stankiewicz B, Kochanowicz A, et al. Ultra-Marathon-Induced Increase in Serum Levels of Vitamin D Metabolites: A Double-Blind Randomized Controlled Trial. Nutrients. 2020;12(12):3629. Published 2020 Nov 25. doi:10.3390/nu12123629
Mieszkowski J, Borkowska A, Stankiewicz B, et al. Single High-Dose Vitamin D Supplementation as an Approach for Reducing Ultramarathon-Induced Inflammation: A Double-Blind Randomized Controlled Trial. Nutrients. 2021;13(4):1280. Published 2021 Apr 13. doi:10.3390/nu13041280
Mieszkowski J, Brzezińska P, Stankiewicz B, et al. Direct Effects of Vitamin D Supplementation on Ultramarathon-Induced Changes in Kynurenine Metabolism. Nutrients. 2022;14(21):4485. Published 2022 Oct 25. doi:10.3390/nu14214485
Mieszkowski J, Brzezińska P, Stankiewicz B, et al. Vitamin D Supplementation Influences Ultramarathon-Induced Changes in Serum Amino Acid Levels, Tryptophan/Branched-Chain Amino Acid Ratio, and Arginine/Asymmetric Dimethylarginine Ratio. Nutrients. 2023;15(16):3536. Published 2023 Aug 11. doi:10.3390/nu15163536
Stankiewicz B, Mieszkowski J, Kochanowicz A, et al. Effect of Single High-Dose Vitamin D3 Supplementation on Post-Ultra Mountain Running Heart Damage and Iron Metabolism Changes: A Double-Blind Randomized Controlled Trial. Nutrients. 2024;16(15):2479. Published 2024 Jul 31. doi:10.3390/nu16152479
Stankiewicz B, Kochanowicz A, et al. Single high-dose vitamin D supplementation impacts ultramarathon-induced changes in serum levels of bone turnover markers: a double-blind randomized controlled trial. J Int Soc Sports Nutr. 2025 Dec;22(1):2561661. doi: 10.1080/15502783.2025.2561661.

00:00 Intro & claim of the episode
00:44 Runner’s World headline: Vitamin D for ultramarathoners
02:03 Kristin’s connection to running and vitamin D skepticism
03:32 Ultramarathon world: Regina’s stories and Death Valley race
06:29 What ultramarathons do to your bones
08:02 Boy story: four stress fractures in one race
10:00 Study design: 40 male runners in Poland
11:33 Missing flow diagram and violated intention-to-treat
13:02 The intervention: 150,000 IU megadose
15:09 Blinding details and missing randomization info
17:13 Measuring bone biomarkers, no primary outcome specified
19:12 The wrong clinicaltrials.gov registration
20:35 Discovery of six papers from one dataset (salami slicing)
23:02 Why salami slicing misleads readers
25:42 Inconsistent reporting across papers
29:11 Changing inclusion criteria and sloppy methods
31:06 Typos, Polish notes, and misnumbered references
32:39 Peer review comedy gold: “Please define vitamin D”
36:06 Reviewer laziness and p-hacking admission
39:13 Results: implausible bone growth mid-race
41:16 Degrees of freedom sleuthing reveals hidden sample sizes
47:07 Open data? Kristin emails the authors
48:42 Lessons from Kristin’s own ultramarathon dataset
51:22 Fishing expeditions and misuse of parametric tests
53:07 Strength of evidence: one smooch each
54:44 Methodologic morals: Mad Libs Science & degrees of freedom breadcrumbs
56:12 Anyone can spot red flags, so trust your eyes
57:34 Outro: skip the vitamin D shot before your next run
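
The "degrees of freedom as breadcrumbs" moral from the ultramarathon episode rests on textbook formulas that are easy to try out. The Python sketch below inverts them for three common cases. The helper names and the example degrees of freedom are our own and hypothetical; they are not values from the ultramarathon papers, and the formulas assume the standard versions of each test (for instance, Welch's unequal-variance t-test reports fractional degrees of freedom and does not follow the simple rule).

```python
def n_from_t_test_df(df: int, design: str) -> int:
    """Recover the sample size implied by a t-test's degrees of freedom.

    design="independent": pooled two-sample t-test, df = n1 + n2 - 2,
                          returns the total across both groups
    design="paired":      paired or one-sample t-test, df = n - 1,
                          returns the number of pairs (or participants)
    """
    if design == "independent":
        return df + 2
    if design == "paired":
        return df + 1
    raise ValueError("design must be 'independent' or 'paired'")


def n_from_anova_df(df_between: int, df_within: int) -> tuple:
    """Recover (number of groups, total N) from a one-way
    between-subjects ANOVA reported as F(df_between, df_within)."""
    groups = df_between + 1       # df_between = k - 1
    total_n = df_within + groups  # df_within = N - k
    return groups, total_n


# Hypothetical reported statistics for illustration only
print(n_from_t_test_df(38, "independent"))  # a t(38) from a two-group comparison
print(n_from_anova_df(2, 27))               # an F(2, 27) from a one-way ANOVA
```

If the sample size recovered this way disagrees with the number of participants the paper says it enrolled, that mismatch is the breadcrumb worth following.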
Sep 22, 2025 • 1h 14min

P-Values: Are we using a flawed statistical tool?

P-values show up in almost every scientific paper, yet they’re one of the most misunderstood ideas in statistics. In this episode, we break from our usual journal-club format to unpack what a p-value really is, why researchers have fought about it for a century, and how that famous 0.05 cutoff became enshrined in science. Along the way, we share stories from our own papers, from a Nature feature that helped reshape the debate to a statistical sleuthing project that uncovered a faulty method in sports science. The result: a behind-the-scenes look at how one statistical tool has shaped the culture of science itself.

Statistical topics
Bayesian statistics
Confidence intervals
Effect size vs. statistical significance
Fisher’s conception of p-values
Frequentist perspective
Magnitude-Based Inference (MBI)
Multiple testing / multiple comparisons
Neyman-Pearson hypothesis testing framework
P-hacking
Posterior probabilities
Preregistration and registered reports
Prior probabilities
P-values
Researcher degrees of freedom
Significance thresholds (p < 0.05)
Simulation-based inference
Statistical power
Statistical significance
Transparency in research
Type I error (false positive)
Type II error (false negative)
Winner’s Curse

Methodological morals
“If p-values tell us the probability the null is true, then octopuses are psychic.”
“Statistical tools don't fool us, blind faith in them does.”

References
Nuzzo R. Scientific method: statistical errors. Nature. 2014 Feb 13;506(7487):150-2. doi: 10.1038/506150a.
Nuzzo, R., 2015. Scientists perturbed by loss of stat tools to sift research fudge from fact. Scientific American, pp.16-18.
Nuzzo RL. The inverse fallacy and interpreting P values. PM&R. 2015 Mar;7(3):311-4. doi: 10.1016/j.pmrj.2015.02.011. Epub 2015 Feb 25.
Nuzzo, R., 2015. Probability wars. New Scientist, 225(3012), pp.38-41.
Sainani KL. Putting P values in perspective. PM&R. 2009 Sep;1(9):873-7. doi: 10.1016/j.pmrj.2009.07.003.
Sainani KL. Clinical versus statistical significance. PM&R. 2012 Jun;4(6):442-5. doi: 10.1016/j.pmrj.2012.04.014.
McLaughlin MJ, Sainani KL. Bonferroni, Holm, and Hochberg corrections: fun names, serious changes to p values. PM&R. 2014 Jun;6(6):544-6. doi: 10.1016/j.pmrj.2014.04.006. Epub 2014 Apr 22.
Sainani KL. The Problem with "Magnitude-based Inference". Med Sci Sports Exerc. 2018 Oct;50(10):2166-2176. doi: 10.1249/MSS.0000000000001645.
Sainani KL, Lohse KR, Jones PR, Vickers A. Magnitude-based Inference is not Bayesian and is not a valid method of inference. Scand J Med Sci Sports. 2019 Sep;29(9):1428-1436. doi: 10.1111/sms.13491.
Lohse KR, Sainani KL, Taylor JA, Butson ML, Knight EJ, Vickers AJ. Systematic review of the use of "magnitude-based inference" in sports science and medicine. PLoS One. 2020 Jun 26;15(6):e0235318. doi: 10.1371/journal.pone.0235318.
Wasserstein, R.L. and Lazar, N.A., 2016. The ASA statement on p-values: context, process, and purpose. The American Statistician, 70(2), pp.129-133.

(00:00) - Intro & claim of the episode
(01:00) - Why p-values matter in science
(02:44) - What is a p-value? (ESP guessing game)
(06:47) - Big vs. small p-values (psychic octopus example)
(08:29) - Significance thresholds and the 0.05 rule
(09:00) - Regina’s Nature paper on p-values
(11:32) - Misconceptions about p-values
(13:18) - Fisher vs. Neyman-Pearson (history & feud)
(16:26) - Botox analogy and type I vs. type II errors
(19:41) - Dating app analogies for false positives/negatives
(22:02) - How the 0.05 cutoff got enshrined
(24:40) - Misinterpretations: statistical vs. practical significance
(26:16) - Effect size, sample size, and “statistically discernible”
(26:45) - P-hacking and researcher degrees of freedom
(29:46) - Transparency, preregistration, and open science
(30:52) - The 0.05 cutoff trap (p = 0.049 vs 0.051)
(31:18) - The biggest misinterpretation: what p-values actually mean
(33:29) - Paul the psychic octopus (worked example)
(35:59) - Why Bayesian statistics differ
(39:49) - Why aren’t we all Bayesian? (probability wars)
(41:05) - The ASA p-value statement (behind the scenes)
(43:16) - Key principles from the ASA white paper
(44:15) - Wrapping up Regina’s paper
(45:33) - Kristin’s paper on sports science (MBI)
(48:10) - What MBI is and how it spread
(50:43) - How Kristin got pulled in (Christie Aschwanden & FiveThirtyEight)
(54:05) - Critiques of MBI and “Bayesian monster” rebuttal
(56:14) - Spreadsheet autopsies (Welsh & Knight)
(58:05) - Cherry juice example (why MBI misleads)
(01:00:22) - Rebuttals and smoke & mirrors from MBI advocates
(01:02:55) - Winner’s Curse and small samples
(01:03:38) - Twitter fights & “establishment statistician”
(01:05:56) - Cult-like following & Matrix red pill analogy
(01:08:06) - Wrap-up
Sep 8, 2025 • 50min

Exercise and Cancer: Does physical activity improve colon cancer survival?

Exercise has long been hailed as cancer-fighting magic, but is there hard evidence behind the hype? In this episode, we tackle the CHALLENGE trial, a large phase III study of colon cancer patients that tested whether prescribed exercise could improve cancer-free survival. We translate clinical jargon into plain English, show why ratio statistics make splashy headlines while absolute differences tell the real story, and take a detour into why statisticians think survival analysis is downright sexy. And we even bring in a classic reality show to make sense of the numbers.

Statistical topics
Data and Safety Monitoring Board (DSMB)
Hazard ratios
Intention-to-treat analysis
Interim analyses
Kaplan-Meier curves
Phase III trials
Randomized clinical trial
Rates and rate ratios
Relative vs absolute differences
Stratified randomization with minimization
Survival analysis
Time-to-event variables

Methodological morals
“Ratio statistics sell headlines. Absolute differences sell truth.”
“Survival analysis is this sexy stats tool that makes every moment and every Cox count.”

References
Courneya KS, Vardy JL, O'Callaghan CJ, et al. Structured Exercise after Adjuvant Chemotherapy for Colon Cancer. NEJM. 2025;393:13-25.
Rabin RC. Are Marathons and Extreme Running Linked to Colon Cancer? The New York Times. Aug 19, 2025.
Sainani KL. Introduction to survival analysis. PM&R. 2016; 8:580-85.
Sainani KL. Making sense of intention-to-treat. PM&R. 2010;2:209-13.

Thanks
Thanks to Caitlin Goodrich for the episode topic tip!

(00:00) - Intro
(05:42) - Two different types of cancer studies
(08:12) - Why might exercise affect cancer?
(10:05) - Phase III trials are different
(12:40) - Who was in the CHALLENGE trial?
(13:31) - Stratified randomization with minimization
(15:05) - The exercise prescription
(19:17) - What did the CHALLENGE trial measure?
(20:04) - Disease-free survival
(21:59) - Data and Safety Monitoring Board: what do they do?
(24:35) - Participants and adherence to exercise
(26:54) - Intention-to-treat analysis
(29:58) - Survival analysis overview
(31:51) - Kaplan-Meier curves
(34:27) - Reality-show analogy
(36:54) - Ratio statistics are confusing
(39:30) - Hazard ratios
(47:03) - Wrap-up, rating, and methodological morals
