Data Skeptic

Latest episodes

May 22, 2015 • 45min

Detecting Cheating in Chess

With the advent of algorithms capable of beating highly ranked chess players, the temptation to cheat has emerged as a potential threat to the integrity of this ancient and complex game. Yet there are aspects of computer play that are measurably different from human play. Dr. Kenneth Regan has developed a methodology for looking at a long series of moves and measuring the likelihood that the moves may have been selected by an algorithm. The full transcript of this episode is well annotated and has a wealth of excellent links to the things discussed. If you're interested in learning more about Dr. Regan, his homepage (Kenneth Regan), his page on wikispaces, and the Amazon page of books by Kenneth W. Regan are all great resources.
May 15, 2015 • 10min

[MINI] z-scores

This episode discusses z-scores and how they describe the distance of an observation from the mean, measured in standard deviations. The hosts explore the 68-95-99.7 rule, calculate z-scores for height, and discuss the likelihood of statistical results being due to chance.
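
As a rough illustration of the idea, the sketch below computes a z-score and checks the 68-95-99.7 rule against a standard normal distribution. The height figures (a mean of 70 inches and a standard deviation of 3 inches) are made-up values for illustration and are not taken from the episode.

```python
from statistics import NormalDist


def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mean) / std_dev


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


# Hypothetical height figures, chosen only for illustration.
MEAN_HEIGHT_IN = 70.0
STD_DEV_IN = 3.0

if __name__ == "__main__":
    z = z_score(76.0, MEAN_HEIGHT_IN, STD_DEV_IN)
    print(f"z-score for 76 inches: {z}")
    for k in (1, 2, 3):
        print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```

A z-score of 2 here means the observation sits two standard deviations above the assumed mean, which the rule says happens on the high or low side only about 5% of the time under a normal model.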
May 8, 2015 • 35min

Using Data to Help Those in Crisis

This week Noelle Sio Saldana discusses her volunteer work at Crisis Text Line - a 24/7 service that connects anyone with crisis counselors. In the episode we discuss Noelle's career and how, as a participant in the Pivotal for Good program (a partnership with DataKind), she spent three months helping find insights in the messaging data collected by Crisis Text Line. These insights helped give visibility into a number of different aspects of Crisis Text Line's services. Listen to this episode to find out how! If you or someone you know is in a moment of crisis, there's someone ready to talk to you by texting the shortcode 741741.
May 1, 2015 • 35min

The Ghost in the MP3

Have you ever wondered what is lost when you compress a song into an MP3? This week's guest Ryan Maguire did more than that. He worked on software to isolate the sounds that are lost when you convert a lossless digital audio recording into a compressed MP3 file. To complete his project, Ryan worked primarily in Python using the pyo library as well as the Bregman Toolkit. Ryan mentioned humans having a range of hearing from 20 Hz to 20,000 Hz; if you'd like to hear those tones, check the previous link. If you'd like to know more about our guest Ryan Maguire, you can find his website at the previous link. To follow The Ghost in the MP3 project, please check out their Facebook page or the site theghostinthemp3.com. A PDF of Ryan's publication-quality write-up can be found at this link: The Ghost in the MP3. It is definitely worth the read if you'd like to know more of the technical details.
Apr 28, 2015 • 27min

Data Fest 2015

This episode contains coverage of the 2015 Data Fest hosted at UCLA. Data Fest is an analysis competition that gives teams of students 48 hours to explore a new dataset and present novel findings. This year, data from Edmunds.com was provided, and students competed in three categories: best recommendation, best use of external data, and best visualization.
Apr 24, 2015 • 16min

[MINI] Cornbread and Overdispersion

For our 50th episode we indulge a bit by cooking Linhda's previously mentioned "healthy" cornbread. This leads to a discussion of the statistical topic of overdispersion, in which the variance of some distribution is larger than what one's underlying model will account for.
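
One simple way to see overdispersion is with count data. The sketch below assumes a Poisson model, under which the variance should roughly equal the mean, and compares two made-up samples. The Poisson assumption, the function name, and the data are all illustrative choices, not details from the episode.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean.

    Under a Poisson model this should be close to 1; values well
    above 1 suggest overdispersion relative to that model.
    """
    return variance(counts) / mean(counts)


# Hypothetical count data, made up for illustration. Both samples have mean 4.
poisson_like = [2, 4, 5, 3, 7, 4, 1, 6]
bursty = [0, 0, 12, 1, 0, 15, 0, 4]

if __name__ == "__main__":
    print(f"poisson-like ratio: {dispersion_ratio(poisson_like):.2f}")
    print(f"bursty ratio:       {dispersion_ratio(bursty):.2f}")
```

Both samples share the same mean, so a model that only captures the mean would treat them alike; the ratio is what exposes the extra variance in the second sample.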
Apr 17, 2015 • 13min

[MINI] Natural Language Processing

This podcast explores the concepts and techniques of natural language processing, including stemming, n-grams, part of speech tagging, and the bag of words approach. It discusses the challenges and applications of training computers to understand and recognize words in sentences and emphasizes the importance of word context and sequences in extracting meaning. The limitations of the 'bag of words' approach are highlighted, and examples are given to demonstrate how word frequency counts can be used to detect similarities between books.
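
The limitation of the bag of words approach can be shown with a small sketch. The tokenizer, function names, and example sentences below are assumptions made for illustration; real NLP libraries offer far more careful tokenization.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text):
    """Word frequency counts, ignoring word order entirely."""
    return Counter(tokenize(text))


def ngrams(tokens, n):
    """All runs of n consecutive tokens, which preserve local word order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def cosine_similarity(bag_a, bag_b):
    """Cosine of the angle between two word-count vectors (0 if either is empty)."""
    dot = sum(bag_a[word] * bag_b[word] for word in set(bag_a) & set(bag_b))
    norm_a = sqrt(sum(count * count for count in bag_a.values()))
    norm_b = sqrt(sum(count * count for count in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    first = "The dog bit the man"
    second = "The man bit the dog"
    # Identical word counts, so bag of words cannot tell the sentences apart.
    print(bag_of_words(first) == bag_of_words(second))
    # Bigrams keep some word order, so they do differ.
    print(ngrams(tokenize(first), 2))
    print(ngrams(tokenize(second), 2))
```

The same `cosine_similarity` function, applied to the word counts of two longer texts, is one common way to score how similar their vocabularies are.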
Apr 10, 2015 • 32min

Computer-based Personality Judgments

Guest Youyou Wu discusses the work she and her collaborators did to measure the accuracy of computer-based personality judgments. Using Facebook "like" data, they found that machine learning approaches could be used to estimate users' self-assessments of the "big five" personality traits: openness, agreeableness, extraversion, conscientiousness, and neuroticism. Interestingly, the computer-based assessments outperformed some of the assessments of certain groups of human beings. Listen to the episode to learn more. The original paper, Computer-based personality judgements are more accurate than those made by humans, appeared in the January 2015 volume of the Proceedings of the National Academy of Sciences (PNAS). For her benevolent recommendation, Youyou suggests Private traits and attributes are predictable from digital records of human behavior by Michal Kosinski, David Stillwell, and Thore Graepel. It's a similar paper by her co-authors which looks at demographic traits rather than personality traits. And for her self-serving recommendation, Youyou has a link that I'm very excited about. You can visit ApplyMagicSauce.com to see how this model evaluates your personality based on your Facebook like information. I'd love it if listeners participated in this research and shared their perspective on the results via The Data Skeptic Podcast Facebook page. I'm going to be posting mine there for everyone to see.
Apr 3, 2015 • 16min

[MINI] Markov Chain Monte Carlo

Explore how Markov Chain Monte Carlo (MCMC) algorithms can be used to model complex systems and track movement probability. Learn how MCMC applies to modeling winery popularity and the likelihood of visiting particular wineries. Discover the real-life applications of MCMC in determining probability distributions, advertising placement, and popular routes.
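
To make the winery idea concrete, here is a minimal sketch of the Metropolis algorithm, one common MCMC method; the episode may describe a different variant. The winery names, popularity weights, and function name are hypothetical. The sampler only ever compares two popularity weights at a time, yet the share of time it spends at each winery approaches that winery's share of total popularity.

```python
import random
from collections import Counter

# Hypothetical, unnormalized popularity weights for four wineries.
POPULARITY = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}


def metropolis_visits(weights, steps, seed=0):
    """Estimate each state's share of visits with a Metropolis random walk."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    states = list(weights)
    current = states[0]
    visits = Counter()
    for _ in range(steps):
        # Symmetric proposal: pick any winery uniformly at random.
        proposal = rng.choice(states)
        # Always move to a more popular winery; move to a less popular
        # one only with probability equal to the ratio of the weights.
        accept_prob = min(1.0, weights[proposal] / weights[current])
        if rng.random() < accept_prob:
            current = proposal
        visits[current] += 1
    return {state: visits[state] / steps for state in states}


if __name__ == "__main__":
    shares = metropolis_visits(POPULARITY, steps=50_000)
    total = sum(POPULARITY.values())
    for name, share in shares.items():
        print(f"{name}: sampled {share:.3f}, target {POPULARITY[name] / total:.3f}")
```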
Mar 20, 2015 • 11min

[MINI] Markov Chains

This podcast discusses Markov Chains and their applications in various systems including stop lights, text prediction, and bowling. The hosts explore the concept of Markov Chains in daily life and technology, as well as their impact on partially observable state spaces.
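
The stop light example can be sketched as a small Markov chain, where the next state depends only on the current one. The transition probabilities and function names below are invented for illustration and do not come from the episode.

```python
import random

# Hypothetical transition probabilities: each row gives the chance of
# moving from the current light color to the next one in a single step.
TRANSITIONS = {
    "green": {"green": 0.8, "yellow": 0.2},
    "yellow": {"yellow": 0.5, "red": 0.5},
    "red": {"red": 0.7, "green": 0.3},
}


def next_state(state, rng):
    """Sample the next state using only the current state (the Markov property)."""
    options = TRANSITIONS[state]
    return rng.choices(list(options), weights=list(options.values()))[0]


def simulate(start, steps, seed=0):
    """Return the sequence of states visited, beginning with the start state."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    path = [start]
    for _ in range(steps):
        path.append(next_state(path[-1], rng))
    return path


if __name__ == "__main__":
    print(" -> ".join(simulate("green", steps=10)))
```

Because a row only lists the states reachable in one step, the chain can never jump straight from green to red, which mirrors how a real stop light cycles.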
