

Data Skeptic
Kyle Polich
The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.
Episodes

Dec 26, 2025 • 38min
Video Recommendations in Industry
Cory Zechmann, a seasoned content curator and the mind behind the music blog Silence No Good, delves into the fascinating blend of human curation and machine learning in content discovery. He discusses the cold start problem and the importance of editorial signals in algorithmic systems. Cory emphasizes the role of human curators in enhancing data and mitigating filter bubbles. He also highlights the significance of balancing familiarity with surprise, and the necessity for better metrics to improve personalization. Lastly, he shares insights on how conversational AI might redefine user preferences in the future.

Dec 18, 2025 • 52min
Eye Tracking in Recommender Systems
In this discussion, guest Santiago De Leon Martinez, a doctoral researcher at the Kempelen Institute, dives into the innovative use of eye tracking in recommender systems. He reveals the mechanics behind gaze data, fixations, and saccades, showcasing the RecGaze dataset tailored for studying browsing patterns. Santiago highlights how eye tracking can uncover insights beyond traditional click data, including positional bias and user engagement. He also discusses ethical concerns and shares his vision for improving recommendation algorithms by simulating user behavior.

Dec 8, 2025 • 40min
Cracking the Cold Start Problem
Boya Xu, an Assistant Professor of Marketing at Virginia Tech, explores the intricacies of recommender systems. She delves into hybrid approaches that combine collaborative filtering and bandit learning to tackle challenges like the cold start problem for new users. Boya emphasizes using demographic information for bootstrapping recommendations and ensuring fairness for minority users. She also discusses how recommender systems affect consumer behavior and content creation across digital platforms, shedding light on the impact of algorithms in shaping user experiences.

Nov 23, 2025 • 37min
Designing Recommender Systems for Digital Humanities
Florian Atzenhofer-Baumgartner is a PhD student at Graz University of Technology, specializing in recommender systems for digital humanities projects like Monasterium.net. He discusses why traditional recommenders fail in complex digital archives, addressing the diverse needs of users from historians to genealogists. Florian elaborates on technical challenges such as sparse interaction matrices and multi-modal similarity approaches. The conversation also highlights the importance of balancing serendipity and utility in recommendations and the unique evaluation metrics for non-commercial domains.

Nov 13, 2025 • 33min
DataRec Library for Reproducibility in Recommender Systems
Alberto Carlo Maria Mancino, a postdoctoral researcher at Politecnico di Bari, dives into the world of recommender systems. He discusses the new DataRec Python library aimed at improving dataset reproducibility and consistency in research. Key topics include the challenges of dataset management, the significant impact of minor changes on research outcomes, and the importance of offline evaluation. Alberto highlights popular datasets like MovieLens and explains how DataRec automates processes and integrates with existing models, ultimately emphasizing the need for better reproducibility in machine learning.

Nov 5, 2025 • 35min
Shilling Attacks on Recommender Systems
In this discussion, Aditya Chichani, a senior machine learning engineer at Walmart with a master's in data science from UC Berkeley, dives into the intriguing world of shilling attacks on recommender systems. He explains how malicious actors manipulate these systems using fake profiles to either promote items or sabotage competitors. Aditya details various attack strategies, like segmented and bandwagon attacks, revealing the alarming prevalence of fake reviews and the vulnerabilities in collaborative filtering. The conversation also highlights detection methods and the ongoing cat-and-mouse dynamics between attackers and system defenders.

Oct 29, 2025 • 52min
Music Playlist Recommendations
Rebecca Salganik, a PhD student at the University of Rochester, combines her passion for music with cutting-edge research in recommender systems. She highlights the challenges of fairness, including popularity bias and multi-interest bias in music recommendations. Her innovative LARP framework enhances playlist continuity using both audio and textual data. By creating the Music Semantics dataset, she captures authentic music descriptions from listeners, paving the way for more personalized music experiences and improved algorithmic recommendations.

Oct 15, 2025 • 35min
Bypassing the Popularity Bias
Václav Blahut, a machine learning researcher at Seznam.cz, dives into the intricate world of personalized news recommendations. He highlights the challenges of popularity bias, explaining how it can skew user exposure toward trending content while neglecting niche interests. Václav introduces the concept of inverse recommendation, where the focus shifts to finding the right users for less popular items. He discusses strategies to balance diversity and business metrics and emphasizes the importance of adapting user profiles through multiple embeddings for a more personalized experience.

Oct 9, 2025 • 38min
Sustainable Recommender Systems for Tourism
In this episode, we speak with Ashmi Banerjee, a doctoral candidate at the Technical University of Munich, about her pioneering research on AI-powered recommender systems in tourism. Ashmi illuminates how these systems can address exposure bias while promoting more sustainable tourism practices through innovative approaches to data acquisition and algorithm design. Key highlights include leveraging large language models for synthetic data generation, developing recommendation architectures that balance user satisfaction with environmental concerns, and creating frameworks that distribute tourism more equitably across destinations. Ashmi's insights offer valuable perspectives for both AI researchers and tourism industry professionals seeking to implement more responsible recommendation technologies.

Sep 22, 2025 • 33min
Interpretable Real Estate Recommendations
Kunal Mukherjee, a postdoctoral research associate at Virginia Tech specializing in graph-based machine learning, discusses his innovative work on human-interpretable real estate recommendations. He highlights how the COVID-19 pandemic has transformed the real estate market, necessitating better recommendation systems. Kunal explains his graph neural network approach that not only recommends properties but also provides clear reasons for these suggestions. The conversation delves into the importance of regional context, the use of user co-click data, and the benefits of graph models over traditional methods.


