

Data Skeptic
Kyle Polich
The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.

Dec 23, 2023 • 24min
I LLM and You Can Too
This episode explores the use of large language models in daily life and work processes. It discusses the challenges and risks of using them as a service, the concept of retrieval-augmented generation, and the use of embeddings and LLMs in text analysis and product development. It also covers applications of text embeddings in similarity, search, and classification tasks, along with their limitations and potential risks.

Dec 19, 2023 • 40min
Q&A with Kyle
In this Q&A episode, the host discusses finding guests algorithmically, impactful technologies and tools, data annotation as remote work, the QBasic programming language, programming experiences and hacker culture, the 'grep' command-line utility, and the importance of Git for source control.

Dec 12, 2023 • 29min
LLMs for Data Analysis
Amir Netz, Technical Fellow at Microsoft and CTO of Microsoft Fabric, discusses how business intelligence has evolved, Power BI and Fabric, building and deploying ML models, benefits of Fabric's auto-integration and auto-optimization, Copilot capabilities, and future developments.

Dec 4, 2023 • 34min
AI Platforms
Eric Boyd, Corporate Vice President of AI at Microsoft, shares how organizations can leverage AI for faster development. He discusses the benefits of using natural language to build products and the future of version control. Eric mentions some foundational models in Azure AI and their capabilities.

Nov 27, 2023 • 35min
Deploying LLMs
Joining us on this episode are Aaron Reich, CTO at Avanade, and Priyanka Shah, MVP for Microsoft AI. They discuss implementing generative AI for productivity gains, the evolution of AI models, hardware changes, designing new products and services, the current state of AI strategy, and building a custom copilot.

Nov 20, 2023 • 26min
A Survey Assessing GitHub Copilot
Jenny Liang, a PhD student at Carnegie Mellon University, discusses her recent survey on the usability of AI programming assistants. She shares some questions and takeaways from the survey, as well as the major reasons developers don't want to use code-generation tools. Concerns about intellectual property and the access code-generation tools have to in-house code are discussed.

Nov 13, 2023 • 32min
Program Aided Language Models
PhD students Aman Madaan and Shuyan Zhou discuss their paper on Program-Aided Language Models. They talk about the evolution and performance of LLMs on arithmetic tasks. Aman introduces PAL and its improvement on arithmetic tasks. Shuyan explains how PAL's performance was evaluated and the limitations of LLMs. They discuss the potential impact of PAL on math education and future research steps.

Nov 6, 2023 • 40min
Which Programming Language is ChatGPT Best At
Alessio Buscemi, software engineer at Lifeware SA, discusses the impact of ChatGPT on software engineers and the efficiency of code generation. He presents a comparative study of code generation across 10 programming languages using ChatGPT 3.5, highlighting unexpected results. He analyzes how performance varies by language and discusses language popularity and the implications for industry practices. Alessio also shares insights on current projects, including sentiment analysis and plagiarism investigation.

Oct 31, 2023 • 31min
GraphText
Jianan Zhao, a computer science student, joins to discuss using graphs with LLMs efficiently. They explore graph inductive bias, graph machine learning, the limitations of natural language models for graphs, GraphText as a preprocessing step, information loss in the translation process, and a comparison with graph neural networks.

Oct 23, 2023 • 28min
arXiv Publication Patterns
Rajiv Movva, a PhD student in Computer Science at Cornell Tech, discusses the findings of his research on arXiv publication patterns for LLMs. He shares insights on the increase in LLM research and the proportions of papers published by universities, organizations, and industry leaders. He highlights the focus on the social impact of LLMs and explores exciting applications in education.