Linear Digressions

Ben Jaffe and Katie Malone

Linear Digressions is a podcast about machine learning and data science. Machine learning is being used to solve a ton of interesting problems, and to accomplish goals that were out of reach even a few short years ago.

Episodes

Mentioned books

Apr 1, 2019 • 23min

Interview with Alex Radovic, particle physicist turned machine learning researcher

You’d be hard-pressed to find a field with bigger, richer, and more scientifically valuable data than particle physics. Years before “data scientist” was even a term, particle physicists were inventing technologies like the world wide web and cloud computing grids to help them distribute and analyze the datasets required to make particle physics discoveries. Somewhat counterintuitively, though, deep learning has only really debuted in particle physics in the last few years, although it’s making up for lost time with many exciting new advances. This episode of Linear Digressions is a little different from most, as we’ll be interviewing a guest, one of my (Katie’s) friends from particle physics, Alex Radovic. Alex and his colleagues have been at the forefront of machine learning in physics over the last few years, and his perspective on the strengths and shortcomings of those two fields together is a fascinating one.

Feb 17, 2019 • 16min

K Nearest Neighbors

K Nearest Neighbors is an algorithm with secrets. On one hand, the algorithm itself is as straightforward as possible: find the labeled points nearest the point that you need to predict, and make a prediction that’s the average of their answers. On the other hand, what does “nearest” mean when you’re dealing with complex data? How do you decide whether a man and a woman of the same age are “nearer” to each other than two women several years apart? What if you convert all your monetary columns from dollars to cents, your distances from miles to nanometers, your weights from pounds to kilograms? Can your definition of “nearest” hold up under these types of transformations? We’re discussing all this, and more, in this week’s episode.

Feb 11, 2019 • 18min

Not every deep learning paper is great. Is that a problem?

Deep learning is a field that’s growing quickly. That’s good! There are lots of new deep learning papers put out every day. That’s good too… right? What if not every paper out there is particularly good? What even makes a paper good in the first place? It’s an interesting thing to think about, and debate, since there’s no clean-cut answer and there are worthwhile arguments both ways. Wherever you find yourself coming down in the debate, though, you’ll appreciate the good papers that much more. Relevant links: https://blog.piekniewski.info/2018/07/14/autopsy-dl-paper/ https://www.reddit.com/r/MachineLearning/comments/90n40l/dautopsy_of_a_deep_learning_paper_quite_brutal/ https://www.reddit.com/r/MachineLearning/comments/agiatj/d_google_ai_refuses_to_share_dataset_fields_for_a/

Feb 3, 2019 • 25min

The Assumptions of Ordinary Least Squares

Ordinary least squares (OLS) is often used synonymously with linear regression. If you’re a data scientist, machine learner, or statistician, you bump into it daily. If you haven’t had the opportunity to build up your understanding from the foundations, though, listen up: there are a number of assumptions underlying OLS that you should know and love. They’re interesting, force you to think about data and statistics, and help you know when you’re out of “good” OLS territory and into places where you could run into trouble.

Jan 28, 2019 • 22min

Quantile Regression

Linear regression is a great tool if you want to make predictions about the mean value that an outcome will have given certain values for the inputs. But what if you want to predict the median? Or the 10th percentile? Or the 90th percentile. You need quantile regression, which is similar to ordinary least squares regression in some ways but with some really interesting twists that make it unique. This week, we’ll go over the concept of quantile regression, and also a bit about how it works and when you might use it. Relevant links: https://www.aeaweb.org/articles?id=10.1257/jep.15.4.143 https://eng.uber.com/analyzing-experiment-outcomes/

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app

Linear Digressions

Episodes

Mentioned books

Statistical Significance in Hypothesis Testing

The Language Model Too Dangerous to Release

The cathedral and the bazaar