

Data Skeptic
Kyle Polich
The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.
Episodes
Mentioned books

Jun 21, 2019 • 23min
Facebook Bargaining Bots Invented a Language
In 2017, Facebook published a paper called Deal or No Deal? End-to-End Learning for Negotiation Dialogues. In this research, the reinforcement learning agents developed a mechanism of communication (which could be called a language) that made them able to optimize their scores in the negotiation game. Many media sources reported this as if it were a first step towards Skynet taking over. In this episode, Kyle discusses bargaining agents and the actual results of this research.

Jun 15, 2019 • 17min
Under Resourced Languages
Priyanka Biswas joins us in this episode to discuss natural language processing for languages that do not have as many resources as those that are more commonly studied such as English. Successful NLP projects benefit from the availability of like large corpora, well-annotated corpora, software libraries, and pre-trained models. For languages that researchers have not paid as much attention to, these tools are not always available.

Jun 8, 2019 • 17min
Named Entity Recognition
Kyle and Linh Da discuss the class of approaches called "Named Entity Recognition" or NER. NER algorithms take any string as input and return a list of "entities" - specific facts and agents in the text along with a classification of the type (e.g. person, date, place).

Jun 1, 2019 • 20min
The Death of a Language
USC students from the CAIS++ student organization have created a variety of novel projects under the mission statement of "artificial intelligence for social good". In this episode, Kyle interviews Zane and Leena about the Endangered Languages Project.

May 25, 2019 • 25min
Neural Turing Machines
Kyle and Linh Da discuss the concepts behind the neural Turing machine.

May 18, 2019 • 30min
Data Infrastructure in the Cloud
Kyle chats with Rohan Kumar about hyperscale, data at the edge, and a variety of other trends in data engineering in the cloud.

May 11, 2019 • 24min
NCAA Predictions on Spark
In this episode, Kyle interviews Laura Edell at MS Build 2019. The conversation covers a number of topics, notably her NCAA Final 4 prediction model.

May 3, 2019 • 15min
The Transformer
Kyle and Linhda discuss attention and the transformer - an encoder/decoder architecture that extends the basic ideas of vector embeddings like word2vec into a more contextual use case.

Apr 26, 2019 • 25min
Mapping Dialects with Twitter Data
When users on Twitter post with geographic tags, it creates the opportunity for a variety of interesting questions to be posed having to do with language, dialects, and location. In this episode, Kyle interviews Bruno Gonçalves about his work studying language in this way.

Apr 20, 2019 • 27min
Sentiment Analysis
This is an interview with Ellen Loeshelle, Director of Product Management at Clarabridge. We primarily discuss sentiment analysis.