Linear Digressions

Ben Jaffe and Katie Malone
undefined
Apr 10, 2017 • 41min

A Technical Deep Dive on Stanley, the First Self-Driving Car

In our follow-up episode to last week's introduction to the first self-driving car, we will be doing a technical deep dive this week and talking about the most important systems for getting a car to drive itself 140 miles across the desert.  Lidar?  You betcha!  Drive-by-wire?  Of course!  Probabilistic terrain reconstruction?  Absolutely!  All this and more this week on Linear Digressions.
undefined
Apr 3, 2017 • 13min

An Introduction to Stanley, the First Self-Driving Car

In October 2005, 23 cars lined up in the desert for a 140 mile race.  Not one of those cars had a driver.  This was the DARPA grand challenge to see if anyone could build an autonomous vehicle capable of navigating a desert route (and if so, whose car could do it the fastest); the winning car, Stanley, now sits in the Smithsonian Museum in Washington DC as arguably the world's first real self-driving car.  In this episode (part one of a two-parter), we'll revisit the DARPA grand challenge from 2005 and the rules and constraints of what it took for Stanley to win the competition.  Next week, we'll do a deep dive into Stanley's control systems and overall operation and what the key systems were that allowed Stanley to win the race.
undefined
Mar 27, 2017 • 20min

Feature Importance

Figuring out what features actually matter in a model is harder to figure out than you might first guess.  When a human makes a decision, you can just ask them--why did you do that?  But with machine learning models, not so much.  That's why we wanted to talk a bit about both regularization (again) and also other ways that you can figure out which models have the biggest impact on the predictions of your model.
undefined
Mar 20, 2017 • 24min

Space Codes!

It's hard to get information to and from Mars.  Mars is very far away, and expensive to get to, and the bandwidth for passing messages with Earth is not huge.  The messages you do pass have to traverse millions of miles, which provides ample opportunity for the message to get corrupted or scrambled.  How, then, can you encode messages so that errors can be detected and corrected?  How does the decoding process allow you to actually find and correct the errors?  In this episode, we'll talk about three pieces of the process (Reed-Solomon codes, convolutional codes, and Viterbi decoding) that allow the scientists at NASA to talk to our rovers on Mars.
undefined
Mar 13, 2017 • 16min

Finding (and Studying) Wikipedia Trolls

You may be shocked to hear this, but sometimes, people on the internet can be mean.  For some of us this is just a minor annoyance, but if you're a maintainer or contributor of a large project like Wikipedia, abusive users can be a huge problem.  Fighting the problem starts with understanding it, and understanding it starts with measuring it; the thing is, for a huge website like Wikipedia, there can be millions of edits and comments where abuse might happen, so measurement isn't a simple task.  That's where machine learning comes in: by building an "abuse classifier," and pointing it at the Wikipedia edit corpus, researchers at Jigsaw and the Wikimedia foundation are for the first time able to estimate abuse rates and curate a dataset of abusive incidents.  Then those researchers, and others, can use that dataset to study the pathologies and effects of Wikipedia trolls.
undefined
Mar 6, 2017 • 17min

A Sprint Through What's New in Neural Networks

Advances in neural networks are moving fast enough that, even though it seems like we talk about them all the time around here, it also always seems like we're barely keeping up.  So this week we have another installment in our "neural nets: they so smart!" series, talking about three topics.  And all the topics this week were listener suggestions, too!
undefined
Feb 27, 2017 • 27min

Stein's Paradox

When you're estimating something about some object that's a member of a larger group of similar objects (say, the batting average of a baseball player, who belongs to a baseball team), how should you estimate it: use measurements of the individual, or get some extra information from the group? The James-Stein estimator tells you how to combine individual and group information make predictions that, taken over the whole group, are more accurate than if you treated each individual, well, individually.
undefined
Feb 20, 2017 • 19min

Empirical Bayes

Say you're looking to use some Bayesian methods to estimate parameters of a system. You've got the normalization figured out, and the likelihood, but the prior... what should you use for a prior? Empirical Bayes has an elegant answer: look to your previous experience, and use past measurements as a starting point in your prior. Scratching your head about some of those terms, and why they matter? Lucky for you, you're standing in front of a podcast episode that unpacks all of this.
undefined
Feb 13, 2017 • 16min

Endogenous Variables and Measuring Protest Effectiveness

Have you been out protesting lately, or watching the protests, and wondered how much effect they might have on lawmakers? It's a tricky question to answer, since usually we need randomly distributed treatments (e.g. big protests) to understand causality, but there's no reason to believe that big protests are actually randomly distributed. In other words, protest size is endogenous to legislative response, and understanding cause and effect is very challenging. So, what to do? Well, at least in the case of studying Tea Party protest effectiveness, researchers have used rainfall, of all things, to understand the impact of a big protest. In other words, rainfall is the instrumental variable in this analysis that cracks the scientific case open. What does rainfall have to do with protests? Do protests actually matter? What do we mean when we talk about endogenous and instrumental variables? We wouldn't be very good podcasters if we answered all those questions here--you gotta listen to this episode to find out.
undefined
Feb 6, 2017 • 15min

Calibrated Models

Remember last week, when we were talking about how great the ROC curve is for evaluating models? How things change... This week, we're exploring calibrated risk models, because that's a kind of model that seems like it would benefit from some nice ROC analysis, but in fact the ROC AUC can steer you wrong there.

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!
App store bannerPlay store banner
Get the app