Linear Digressions
Ben Jaffe and Katie Malone
Linear Digressions is a podcast about machine learning and data science. Machine learning is being used to solve a ton of interesting problems, and to accomplish goals that were out of reach even a few short years ago.
Episodes
Mentioned books
Jun 11, 2018 • 18min
GDPR
By now, you have probably heard of GDPR, the EU's new data privacy law. It's the reason you've been getting so many emails about everyone's updated privacy policy.
In this episode, we talk about some of the potential ramifications of GRPD in the world of data science.
Jun 3, 2018 • 22min
Git for Data Scientists
If you're a data scientist, chances are good that you've heard of git, which is a system for version controlling code. Chances are also good that you're not quite as up on git as you want to be--git has a strong following among software engineers but, in our anecdotal experience, data scientists are less likely to know how to use this powerful tool. Never fear: in this episode we'll talk through some of the basics, and what does (and doesn't) translate from version control for regular software to version control for data science software.
May 20, 2018 • 20min
Analytics Maturity
Data science and analytics are hot topics in business these days, but for a lot of folks looking to bring data into their organization, it can be hard to know where to start and what it looks like when they're succeeding. That was the motivation for writing a whitepaper on the analytics maturity of an organization, and that's what we're talking about today. In particular, we break it down into five attributes of an organization that contribute (or not) to their success in analytics, and what each of those mean and why they matter.
Whitepaper here:
bit.ly/analyticsmaturity
May 13, 2018 • 19min
SHAP: Shapley Values in Machine Learning
Shapley values in machine learning are an interesting and useful enough innovation that we figured hey, why not do a two-parter? Our last episode focused on explaining what Shapley values are: they define a way of assigning credit for outcomes across several contributors, originally to understand how impactful different actors are in building coalitions (hence the game theory background) but now they're being cross-purposed for quantifying feature importance in machine learning models. This episode centers on the computational details that allow Shapley values to be approximated quickly, and a new package called SHAP that makes all this innovation accessible.
May 7, 2018 • 27min
Game Theory for Model Interpretability: Shapley Values
As machine learning models get into the hands of more and more users, there's an increasing expectation that black box isn't good enough: users want to understand why the model made a given prediction, not just what the prediction itself is. This is motivating a lot of work into feature important and model interpretability tools, and one of the most exciting new ones is based on Shapley Values from game theory. In this episode, we'll explain what Shapley Values are and how they make a cool approach to feature importance for machine learning.
Apr 30, 2018 • 15min
AutoML
If you were a machine learning researcher or data scientist ten years ago, you might have spent a lot of time implementing individual algorithms like decision trees and neural networks by hand. If you were doing that work five years ago, the algorithms were probably already implemented in popular open-source libraries like scikit-learn, but you still might have spent a lot of time trying different algorithms and tuning hyperparameters to improve performance. If you're doing that work today, scikit-learn and similar libraries don't just have the algorithms nicely implemented--they have tools to help with experimentation and hyperparameter tuning too. Automated machine learning is here, and it's pretty cool.
Apr 23, 2018 • 13min
CPUs, GPUs, TPUs: Hardware for Deep Learning
A huge part of the ascent of deep learning in the last few years is related to advances in computer hardware that makes it possible to do the computational heavy lifting required to build models with thousands or even millions of tunable parameters. This week we'll pretend to be electrical engineers and talk about how modern machine learning is enabled by hardware.
Apr 16, 2018 • 31min
A Technical Introduction to Capsule Networks
Last episode we talked conceptually about capsule networks, the latest and greatest computer vision innovation to come out of Geoff Hinton's lab. This week we're getting a little more into the technical details, for those of you ready to have your mind stretched.
Apr 9, 2018 • 14min
A Conceptual Introduction to Capsule Networks
Convolutional nets are great for image classification... if this were 2016. But it's 2018 and Canada's greatest neural networker Geoff Hinton has some new ideas, namely capsule networks. Capsule nets are a completely new type of neural net architecture designed to do image classification on far fewer training cases than convolutional nets, and they're posting results that are competitive with much more mature technologies.
In this episode, we'll give a light conceptual introduction to capsule nets and get geared up for a future episode that will do a deeper technical dive.
Apr 2, 2018 • 22min
Convolutional Neural Nets
If you've done image recognition or computer vision tasks with a neural network, you've probably used a convolutional neural net. This episode is all about the architecture and implementation details of convolutional networks, and the tricks that make them so good at image tasks.


