Linear Digressions

Ben Jaffe and Katie Malone
Jul 26, 2020 • 36min

So long, and thanks for all the fish

After more than five years, the hosts bid farewell and thank the audience. They reflect on the podcast's journey and evolution, their speech patterns, and the challenges along the way, and share plans for their free time.
Jul 19, 2020 • 14min

A Reality Check on AI-Driven Medical Assistants

Delving into the impact of AI algorithms on medical diagnosis, the podcast discusses the challenges and benefits of using machine learning models to diagnose conditions like diabetic retinopathy and liver cancer. Despite the potential for faster diagnoses, issues like internet connectivity, image quality, and reliance on algorithms over human judgment pose significant challenges in the healthcare system.
Jul 13, 2020 • 24min

A Data Science Take on Open Policing Data

A few weeks ago, we put out a call for data scientists interested in issues of race and racism, or people studying those topics with data science methods, to get in touch and come talk to our audience about their work. This week we’re excited to bring on Todd Hendricks, a Bay Area data scientist and volunteer who reached out to tell us about his studies with the Stanford Open Policing dataset.
Jul 6, 2020 • 30min

Procella: YouTube's super-system for analytics data storage

This is a re-release of an episode that originally ran in October 2019. If you’re trying to manage a project that serves up analytics data for a few very distinct uses, you’d be wise to consider having custom solutions for each use case that are optimized for the needs and constraints of that use case. You also wouldn’t be YouTube, which found itself with this problem (gigantic data needs and several very different use cases for that data) and went a different way: they built one analytics data system to serve them all. Procella, the system they built, is the topic of our episode today: by deconstructing the system, we dig into its four motivating uses, the complexity they had to introduce to service all four uses simultaneously, and the impressive engineering that has to go into building something that “just works.”
Jun 29, 2020 • 23min

The Data Science Open Source Ecosystem

Exploring the dynamics of open source contributions in data science, the podcast discusses the disparity in maintenance, the role of individuals and corporations, and ways companies engage in open source. It also emphasizes the importance of supporting open source software, including showing appreciation to maintainers on social media.
Jun 21, 2020 • 16min

Rock the ROC Curve

This podcast dives into the fascinating world of Receiver Operating Characteristic (ROC) curves, tracing their origins back to WWII radar technology. It discusses how human operators faced challenges in distinguishing between enemy aircraft and false alarms. The importance of true positives and the role of biases in ROC analysis are explored, along with the complexities of identifying objects. The conversation wraps up with insights on ROC and Area Under the Curve (AUC) as vital tools for evaluating classification models, especially in imbalanced scenarios like fraud detection.
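
As a rough illustration of the ROC and AUC ideas mentioned above, here is a minimal sketch in plain Python. It is not taken from the episode: the function names `roc_points` and `auc` and the toy labels and scores are hypothetical, chosen only to show how sweeping a decision threshold traces out the curve and how the trapezoid rule gives the area under it.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points.

    The decision threshold is swept from the highest score to the lowest;
    labels are 1 for positive examples and 0 for negative examples.
    Assumes at least one positive and one negative example.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Rank examples from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def auc(points):
    """Area under the ROC curve, computed with the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical toy data: three positives and three negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = roc_points(labels, scores)
print(curve)
print(auc(curve))
```

For this toy data the AUC comes out to 8/9, which matches the usual interpretation: of the nine positive/negative pairs, eight have the positive example scored higher than the negative one.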
Jun 15, 2020 • 31min

Criminology and Data Science

This episode features Zach Drake, a working data scientist and PhD candidate in the Criminology, Law and Society program at George Mason University. Zach specializes in bringing data science methods to studies of criminal behavior, and got in touch after our last episode (about racially complicated recidivism algorithms). Our conversation covers a wide range of topics—common misconceptions around race and crime statistics, how methodologically-driven criminology scholars think about building crime prediction models, and how to think about policy changes when we don’t have a complete understanding of cause and effect in criminology. For the many of us currently re-thinking race and criminal justice, but wanting to be data-driven about it, this conversation with Zach is a must-listen.
Jun 7, 2020 • 32min

Racism, the criminal justice system, and data science

As protests sweep across the United States in the wake of the killing of George Floyd by a Minneapolis police officer, we take a moment to dig into one of the ways that data science perpetuates and amplifies racism in the American criminal justice system. COMPAS is an algorithm that claims to give a prediction about the likelihood of an offender to re-offend if released, based on the attributes of the individual, and guess what: it shows disparities in the predictions for black and white offenders that would nudge judges toward giving harsher sentences to black individuals. We dig into this algorithm a little more deeply, unpacking how different metrics give different pictures of the “fairness” of the predictions and what is causing its racially disparate output (to wit: race is explicitly not an input to the algorithm, and yet the algorithm gives outputs that correlate with race—what gives?). Unfortunately it’s not an open-and-shut case of a tuning parameter being off, or the wrong metric being used: instead the biases in the justice system itself are being captured in the algorithm outputs, in such a way that a self-fulfilling prophecy of harsher treatment for black defendants is all but guaranteed. Like many other things this week, this episode left us thinking about bigger, systemic issues, and why it’s proven so hard for years to fix what’s broken.
Jun 5, 2020 • 6min

An interstitial word from Ben

A message from Ben around algorithmic bias, and how our models are sometimes reflections of ourselves.
May 31, 2020 • 22min

Convolutional Neural Networks

This is a re-release of an episode that originally aired on April 1, 2018. If you've done image recognition or computer vision tasks with a neural network, you've probably used a convolutional neural net. This episode is all about the architecture and implementation details of convolutional networks, and the tricks that make them so good at image tasks.
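
For readers who want a concrete picture of the core operation, here is a minimal sketch of a single 2D convolution in plain Python. It is an illustration under stated assumptions rather than anything from the episode: the function name `conv2d_valid`, the toy image, and the edge-detecting kernel are all hypothetical, and, as in most deep learning libraries, the kernel is slid over the image without being flipped (strictly speaking, a cross-correlation).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding and a stride of 1.

    Both arguments are lists of equal-length rows of numbers. Each output
    value is the sum of elementwise products between the kernel and the
    image patch currently underneath it.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1

    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 3x4 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2x2 kernel that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The same small kernel is reused at every position, so the output lights up wherever the edge appears. In a trained network the kernel values are learned rather than hand-written as they are here.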
