Data Skeptic

Latest episodes

Feb 28, 2022 • 22min

Customer Clustering

Have you ever wondered how you can use clustering to extract meaningful insight from single-feature time-series data? In today’s episode, Ehsan speaks about his recent research on actionable feature extraction using clustering techniques. Want to find out more? Listen to discover the methodologies he used for his research and the results he obtained. Visit our website for extended show notes! https://clear.ml/ ClearML is an open-source MLOps solution users love to customize, helping you easily Track, Orchestrate, and Automate ML workflows at scale.
Feb 22, 2022 • 23min

k-means Image Segmentation

Linh Da joins us to explore how image segmentation can be done using k-means clustering. Image segmentation involves dividing an image into a set of distinct segments. One approach is to segment purely on color, in which case k-means clustering is a good option. Check out our website for extended show notes and images! Thanks to our sponsors: visit Weights and Biases and mention Data Skeptic when you request a demo, and Nomad Data. In the image below, you can see the k-means clustering segmentation results for the same image with k set to 2, 4, 6, and 8.
Feb 18, 2022 • 26min

Tracking Elephant Clusters

In today’s episode, Gregory Glatzer explains his machine learning project on predicting elephant movement and settlement in a bid to limit the activities of poachers. He used two machine learning algorithms, DBSCAN and k-means clustering, at different stages of the project. Listen to learn why these two techniques were useful and what conclusions could be drawn. See additional show notes on our website! Thanks to our sponsor, Astrato.
Feb 14, 2022 • 24min

k-means clustering

Welcome to our new season, Data Skeptic: k-means clustering. Each week will feature an interview or discussion related to this classic algorithm, its use cases, and its analysis. This episode is an overview of the topic presented in several segments.
Feb 7, 2022 • 47min

Snowflake Essentials

Frank Bell, Snowflake Data Superhero and SnowPro, joins us today to talk about his book “Snowflake Essentials: Getting Started with Big Data in the Cloud.” Snowflake Essentials: Getting Started with Big Data in the Cloud by Frank Bell, Raj Chirumamilla, Bhaskar B. Joshi, Bjorn Lindstrom, Ruchi Soni, and Sameer Videkar; Snowflake Solutions; Snoptimizer - Snowflake Cost, Security, and Performance Optimization - Coming Soon! Thanks to our sponsors: Find Better Data Faster with Nomad Data. Visit nomad-data.com. Visit Springboard and use promo code DATASKEPTIC to receive a $750 discount.
Jan 31, 2022 • 35min

Explainable Climate Science

Zack Labe, a Post-Doctoral Researcher at Colorado State University, joins us today to discuss his work “Detecting Climate Signals using Explainable AI with Single Forcing Large Ensembles.” Works Mentioned: “Detecting Climate Signals using Explainable AI with Single Forcing Large Ensembles” by Zachary M. Labe and Elizabeth A. Barnes. Sponsored by: Astrato and BBEdit by Bare Bones Software.
Jan 24, 2022 • 43min

Energy Forecasting Pipelines

Erin Boyle, the Head of Data Science at Myst AI, joins us today to talk about her work with Myst AI, a time series forecasting platform and service with the objective of positively impacting sustainability. https://docs.myst.ai/docs Visit Weights and Biases at wandb.me/dataskeptic. Find Better Data Faster with Nomad Data. Visit nomad-data.com.
Jan 17, 2022 • 39min

Matrix Profiles in Stumpy

Sean Law, Principal Data Scientist, R&D at a Fortune 500 Company, comes on to talk about his creation of the STUMPY Python library. Sponsored by Hello Fresh and mParticle: Go to Hellofresh.com/dataskeptic16 for up to 16 free meals AND 3 free gifts! Visit mparticle.com to learn how teams at Postmates, NBCUniversal, Spotify, and Airbnb use mParticle’s customer data infrastructure to accelerate their customer data strategies.
Jan 14, 2022 • 25min

The Great Australian Prediction Project

Data scientists and psychics have at least one major thing in common: both professions attempt to predict the future. A data scientist does this using algorithms and data, and the prediction often comes with some measure of quality such as a confidence interval or estimated accuracy. In contrast, psychics rely on their intuition or an appeal to the supernatural as the source for their predictions. Still, in the interest of empirical evidence, the quality of predictions made by psychics can be put to the test. The Great Australian Psychic Prediction Project seeks to do exactly that. It's the longest known project tracking annual predictions made by psychics, and the accuracy of those predictions in hindsight. Richard Saunders, host of The Skeptic Zone Podcast, joins us to share the results of this decadal study. Read the full report: https://www.skeptics.com.au/2021/12/09/psychic-project-full-results-released/ And follow The Skeptic Zone: https://www.skepticzone.tv/
Jan 10, 2022 • 26min

Water Demand Forecasting

Georgia Papacharalampous, Researcher at the National Technical University of Athens, joins us today to talk about her work “Probabilistic water demand forecasting using quantile regression algorithms.” Visit Springboard and use promo code DATASKEPTIC to receive a $750 discount
