Data Skeptic

Kyle Polich

The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.

Episodes

Nov 13, 2025 • 33min

DataRec Library for Reproducible in Recommend Systems

Alberto Carlo Maria Mancino, a postdoctoral researcher at Politecnico di Bari, dives into the world of recommender systems. He discusses the new DataRec Python library aimed at improving dataset reproducibility and consistency in research. Key topics include the challenges of dataset management, the significant impact of minor changes on research outcomes, and the importance of offline evaluation. Alberto highlights popular datasets like MovieLens and explains how DataRec automates processes and integrates with existing models, ultimately emphasizing the need for better reproducibility in machine learning.

Nov 5, 2025 • 35min

Shilling Attacks on Recommender Systems

In this discussion, Aditya Chichani, a senior machine learning engineer at Walmart with a master's in data science from UC Berkeley, dives into the intriguing world of shilling attacks on recommender systems. He explains how malicious actors manipulate these systems using fake profiles to either promote items or sabotage competitors. Aditya details various attack strategies, like segmented and bandwagon attacks, revealing the alarming prevalence of fake reviews and the vulnerabilities in collaborative filtering. The conversation also highlights detection methods and the ongoing cat-and-mouse dynamics between attackers and system defenders.

Oct 29, 2025 • 52min

Music Playlist Recommendations

Rebecca Salganik, a PhD student at the University of Rochester, combines her passion for music with cutting-edge research in recommender systems. She highlights the challenges of fairness, including popularity bias and multi-interest bias in music recommendations. Her innovative LARP framework enhances playlist continuity using both audio and textual data. By creating the Music Semantics dataset, she captures authentic music descriptions from listeners, paving the way for more personalized music experiences and improved algorithmic recommendations.

Oct 15, 2025 • 35min

Bypassing the Popularity Bias

Václav Blahut, a machine learning researcher at Seznam.cz, dives into the intricate world of personalized news recommendations. He highlights the challenges of popularity bias, explaining how it can skew user exposure toward trending content while neglecting niche interests. Václav introduces the concept of inverse recommendation, where the focus shifts to finding the right users for less popular items. He discusses strategies to balance diversity and business metrics and emphasizes the importance of adapting user profiles through multiple embeddings for a more personalized experience.

Oct 9, 2025 • 38min

Sustainable Recommender Systems for Tourism

In this episode, we speak with Ashmi Banerjee, a doctoral candidate at the Technical University of Munich, about her pioneering research on AI-powered recommender systems in tourism. Ashmi illuminates how these systems can address exposure bias while promoting more sustainable tourism practices through innovative approaches to data acquisition and algorithm design. Key highlights include leveraging large language models for synthetic data generation, developing recommendation architectures that balance user satisfaction with environmental concerns, and creating frameworks that distribute tourism more equitably across destinations. Ashmi's insights offer valuable perspectives for both AI researchers and tourism industry professionals seeking to implement more responsible recommendation technologies.

Sep 22, 2025 • 33min

Interpretable Real Estate Recommendations

Kunal Mukherjee, a postdoctoral research associate at Virginia Tech specializing in graph-based machine learning, discusses his innovative work on human-interpretable real estate recommendations. He highlights how the COVID-19 pandemic has transformed the real estate market, necessitating better recommendation systems. Kunal explains his graph neural network approach that not only recommends properties but also provides clear reasons for these suggestions. The conversation delves into the importance of regional context, the use of user co-click data, and the benefits of graph models over traditional methods.

Sep 8, 2025 • 50min

Why Am I Seeing This?

Dimitri Ognibene, Director of the Biconnaire Club at the University of Milano-Bicocca, dives into the intricate world of social media recommender systems. The conversation highlights the lack of accessible exposure data and the resulting challenges in understanding algorithmic influence. It covers innovative solutions like the 'recommender neutral user model' to address biases and promote data privacy. Insightful reflections on user perceptions, especially among teenagers, reveal how opaque algorithms can lead to polarization and dissatisfaction, underscoring the need for transparency.

Aug 30, 2025 • 45min

Eco-aware GNN Recommenders

Antonio Purificato, a PhD student at Sapienza University of Rome and researcher at Amazon, discusses groundbreaking work in eco-aware graph neural networks. He delves into the environmental costs of traditional recommendation systems and presents innovative methods to model user-item relationships with sustainability in mind. Topics include the CodeCarbon framework for monitoring energy consumption and the balance between algorithmic performance and ecological responsibility. Antonio highlights the need for eco-friendly practices in AI as the tech world strides forward.

Aug 17, 2025 • 18min

Networks and Recommender Systems

The hosts introduce an exciting new season focused on recommender systems. They dive into how network science aids in shaping recommendation algorithms, highlighting the significance of both explicit and implicit connections. Discussions on the Internet of Agents reveal the potential of AI agents in collaborative ecosystems. They also tackle link prediction and node similarity, sharing personal experiences with recommender systems and challenges like the cold start problem. Lastly, insights into influencer dynamics using PageRank add a fresh perspective on social influence.

Jul 21, 2025 • 34min

Network of Past Guests Collaborations

Discover how DIY network analysis unveils connections among past podcast guests based on their co-authorship. Dive into the intricacies of academic collaborations, exploring underrepresented voices and the application of machine learning. Unpack the challenges of multi-agent software and personal data privacy while learning about effective data visualization techniques. Gain insights into the dynamics of academic publishing and uncover metrics that reveal the hidden structures in scholarly networks, showcasing the richness of collaborative research.
