Towards Data Science

Latest episodes

Dec 2, 2020 • 45min

60. Rob Miles - Why should I care about AI safety?

Progress in AI capabilities has consistently surprised just about everyone, including the very developers and engineers who build today’s most advanced AI systems. AI can now match or exceed human performance in everything from speech recognition to driving, and one question that’s increasingly on people’s minds is: when will AI systems be better than humans at AI research itself? The short answer, of course, is that no one knows for sure — but some have taken educated guesses, including Nick Bostrom and Stuart Russell. One common hypothesis is that once AI systems are better than humans at improving their own performance, we can expect at least some of them to do so. In the process, these self-improving systems would become even more powerful than they were previously — and therefore, even more capable of further self-improvement. With each additional self-improvement step, improvements in a system’s performance would compound. Where this all ultimately leads, no one really has a clue, but it’s safe to say that if there’s a good chance we’re going to be creating systems capable of this kind of stunt, we ought to think hard about how we should be building them. This concern, among many others, has led to the development of the rich field of AI safety, and my guest for this episode, Robert Miles, has been involved in popularizing AI safety research for more than half a decade through two very successful YouTube channels, Robert Miles and Computerphile. He joined me on the podcast to discuss how he’s thinking about AI safety, what AI means for the course of human evolution, and what our biggest challenges will be in taming advanced AI.
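To make the compounding intuition a little more concrete, here is a toy sketch in Python. The constant per-step improvement factor is purely an illustrative assumption, not a forecast of how real systems would behave:

```python
# Toy illustration of compounding self-improvement. The fixed
# improvement_factor is an assumption made purely for illustration.
def self_improvement_curve(initial_capability=1.0, improvement_factor=1.1, steps=10):
    capabilities = [initial_capability]
    for _ in range(steps):
        # Each round of self-improvement builds on the previous one,
        # so gains compound geometrically rather than adding up linearly.
        capabilities.append(capabilities[-1] * improvement_factor)
    return capabilities

print(self_improvement_curve())  # 1.0, 1.1, 1.21, ... roughly 2.59 after 10 steps
```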
Nov 25, 2020 • 44min

59. Matthew Stewart - Tiny ML and the future of on-device AI

When it comes to machine learning, we’re often led to believe that bigger is better. It’s now pretty clear that, all else being equal, more data, more compute, and larger models add up to give more performance and more generalization power. And cutting edge language models have been growing at an alarming rate — by up to 10X each year. But size isn’t everything. While larger models are certainly more capable, they can’t be used in all contexts: take, for example, the case of a cell phone or a small drone, where on-device memory and processing power just aren’t enough to accommodate giant neural networks or huge amounts of data. The art of doing machine learning on small devices with significant power and memory constraints is pretty new, and it’s now known as “tiny ML”. Tiny ML unlocks an awful lot of exciting applications, but also raises a number of safety and ethical questions. And that’s why I wanted to sit down with Matthew Stewart, a Harvard PhD researcher focused on applying tiny ML to environmental monitoring. Matthew has worked with many of the world’s top tiny ML researchers, and our conversation focused on the possibilities and potential risks associated with this promising new field.
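For a flavour of what the tiny ML workflow can look like in practice, here is a minimal sketch using TensorFlow Lite to shrink a small Keras model for on-device use. The model architecture and the use of default post-training quantization are my own illustrative assumptions, not the specific pipeline discussed in the episode:

```python
import tensorflow as tf

# A deliberately small network, sized with microcontroller-class devices in mind.
# In practice you would train this model on real sensor data before converting it.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),
])

# Convert to TensorFlow Lite with post-training quantization, which shrinks the
# model's size and memory footprint so it can fit on a constrained device.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
print(f"Converted model size: {len(tflite_model)} bytes")
```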
Nov 18, 2020 • 37min

58. David Duvenaud - Using generative models for explainable AI

In the early 1900s, all of our predictions were the direct product of human brains. Scientists, analysts, climatologists, mathematicians, bankers, lawyers and politicians did their best to anticipate future events, and plan accordingly. Take physics, for example, where every task we now think of as part of the learning process, from data collection to cleaning to feature selection to modeling, had to happen inside a physicist’s head. When Einstein introduced gravitational fields, what he was really doing was proposing a new feature to be added to our model of the universe. And the gravitational field equations that he put forward at the same time were an update to that very model. Einstein didn’t come up with his new model (or “theory” as physicists call it) of gravity by running model.fit() in a Jupyter notebook. In fact, he never outsourced any of the computations that were needed to develop it to machines. Today, that’s somewhat unusual, and most of the predictions that the world runs on are generated in part by computers. But only in part — until we have fully general artificial intelligence, machine learning will always be a mix of two things: first, the constraints that human developers impose on their models, and second, the calculations that go into optimizing those models, which we outsource to machines. The human touch is still a necessary and ubiquitous component of every machine learning pipeline, but it’s ultimately limiting: the more of the learning pipeline we can outsource to machines, the more we can take advantage of computers’ ability to learn faster and from far more data than human beings. But designing algorithms that are flexible enough to do that requires serious outside-the-box thinking — exactly the kind of thinking that University of Toronto professor and researcher David Duvenaud specializes in. I asked David to join me for the latest episode of the podcast to talk about his research on more flexible and robust machine learning strategies.
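To make that split concrete, here is a minimal scikit-learn sketch (the features and model family are my own illustrative choices): the human decides which inputs matter and which kind of model to use, and the call to fit() hands the optimization over to the machine:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Human-imposed constraints: we choose the inputs (features) and the
# model family (a linear model) before any optimization happens.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))          # two hand-picked features
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Machine-run optimization: .fit() solves for the coefficients that best
# explain the data, which is the part we outsource to the computer.
model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)   # recovers roughly [3.0, -2.0] and ~0.0
```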
Nov 11, 2020 • 1h 5min

57. Dylan Hadfield-Menell - Humans in the loop

Human beings are collaborating with artificial intelligences on an increasing number of high-stakes tasks. I’m not just talking about robot-assisted surgery or self-driving cars here — every day, social media apps recommend content to us that quite literally shapes our worldviews and our cultures. And very few of us even have a basic idea of how these all-important recommendations are generated. As time goes on, we’re likely going to become increasingly dependent on our machines, outsourcing more and more of our thinking to them. If we aren’t thoughtful about the way we do this, we risk creating a world that doesn’t reflect our current values or objectives. That’s why the domain of human/AI collaboration and interaction is so important — and it’s the reason I wanted to speak to Berkeley AI researcher Dylan Hadfield-Menell for this episode of the Towards Data Science podcast. Dylan’s work is focused on designing algorithms that could allow humans and robots to collaborate more constructively, and he’s one of a small but growing cohort of AI researchers focused on the area of AI ethics and AI alignment.
Nov 4, 2020 • 56min

56. Annette Zimmermann - The ethics of AI

As AI systems have become more powerful, they’ve been deployed to tackle an increasing number of problems. Take computer vision. Less than a decade ago, one of the most advanced applications of computer vision algorithms was to classify hand-written digits on mail. And yet today, computer vision is being applied to everything from self-driving cars to facial recognition and cancer diagnostics. Practically useful AI systems have now firmly moved from “what if?” territory to “what now?” territory. And as more and more of our lives are run by algorithms, an increasing number of researchers from domains outside computer science and engineering are starting to take notice. Most notable among these are philosophers, many of whom are concerned about the ethical implications of outsourcing our decision-making to machines whose reasoning we often can’t understand or even interpret. One of the most important voices in the world of AI ethics has been that of Dr Annette Zimmermann, a Technology & Human Rights Fellow at the Carr Center for Human Rights Policy at Harvard University, and a Lecturer in Philosophy at the University of York. Annette has focused a lot of her work on exploring the overlap between algorithms, society and governance, and I had the chance to sit down with her to discuss her views on bias in machine learning, algorithmic fairness, and the big picture of AI ethics.
Oct 28, 2020 • 51min

55. Rohin Shah - Effective altruism, AI safety, and learning human preferences from the state of the world

If you walked into a room filled with objects that were scattered around somewhat randomly, how important or expensive would you assume those objects were? What if you walked into the same room, and instead found those objects carefully arranged in a very specific configuration that was unlikely to happen by chance? These two scenarios hint at something important: human beings have shaped our environments in ways that reflect what we value. You might just learn more about what I value by taking a 10-minute stroll through my apartment than by spending 30 minutes talking to me as I try to put my life philosophy into words. And that’s a pretty important idea, because as it turns out, one of the most important challenges in advanced AI today is finding ways to communicate our values to machines. If our environments implicitly encode part of our value system, then we might be able to teach machines to observe it, and learn about our preferences without our having to express them explicitly. The idea of deriving human values from the state of a human-inhabited environment was first developed in a paper co-authored by Berkeley PhD and incoming DeepMind researcher Rohin Shah. Rohin has spent the last several years working on AI safety, and publishes the widely read AI alignment newsletter — and he was kind enough to join us for this episode of the Towards Data Science podcast, where we discussed his approach to AI safety, and his thoughts on risk mitigation strategies for advanced AI systems.
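As a deliberately simplified toy version of that intuition (my own sketch, not the method from Rohin’s paper): if an observed arrangement of objects would be very unlikely to arise by chance, that is evidence someone arranged it on purpose, and therefore that the arrangement reflects a preference:

```python
import random

def inversions(arrangement):
    """Count adjacent out-of-order pairs; 0 means perfectly sorted."""
    return sum(1 for a, b in zip(arrangement, arrangement[1:]) if a > b)

def preference_evidence(observed, n_samples=10_000, seed=0):
    """Estimate how often a random arrangement looks at least as ordered as
    the observed one. A small value suggests the state was arranged
    deliberately, i.e. it implicitly encodes a preference."""
    rng = random.Random(seed)
    observed_score = inversions(observed)
    items = list(observed)
    hits = 0
    for _ in range(n_samples):
        rng.shuffle(items)
        if inversions(items) <= observed_score:
            hits += 1
    return hits / n_samples

# Five books sorted by height on a shelf: unlikely to happen by chance.
print(preference_evidence([1, 2, 3, 4, 5]))  # ~0.008 (about 1 in 120)
```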
Oct 15, 2020 • 54min

54. Tim Rocktäschel - Deep reinforcement learning, symbolic learning and the road to AGI

Reinforcement learning can do some pretty impressive things. It can optimize ad targeting, help run self-driving cars, and even win StarCraft games. But current RL systems are still highly task-specific. Tesla’s self-driving car algorithm can’t win at StarCraft, and DeepMind’s AlphaZero algorithm can win Go matches against grandmasters, but can’t optimize your company’s ad spend. So how do we make the leap from narrow AI systems that leverage reinforcement learning to solve specific problems, to more general systems that can orient themselves in the world? Enter Tim Rocktäschel, a Research Scientist at Facebook AI Research London and a Lecturer in the Department of Computer Science at University College London. Much of Tim’s work has been focused on ways to make RL agents learn with relatively little data, using strategies known as sample efficient learning, in the hopes of improving their ability to solve more general problems. Tim joined me for this episode of the podcast.
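To give a rough sense of one ingredient that often shows up in sample efficient learning (a minimal sketch of my own, not Tim’s work): experience replay lets an agent reuse past transitions many times instead of throwing them away after a single update:

```python
import random
from collections import deque

# A tiny chain MDP: states 0..3, actions 0 (left) / 1 (right),
# reward 1 for reaching the goal state. Purely illustrative.
N_STATES, GOAL = 4, 3

def step(state, action):
    next_state = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

q = [[0.0, 0.0] for _ in range(N_STATES)]   # tabular Q-values
replay = deque(maxlen=1000)                 # buffer of past transitions
rng = random.Random(0)
alpha, gamma, epsilon = 0.1, 0.95, 0.3

for episode in range(100):
    state, done = 0, False
    while not done:
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.randrange(2)                            # explore
        else:
            action = max((0, 1), key=lambda a: q[state][a])      # exploit
        next_state, reward, done = step(state, action)
        replay.append((state, action, reward, next_state, done))
        # Experience replay: reuse a small batch of stored transitions on
        # every new step, extracting more learning per environment interaction.
        for s, a, r, s2, d in rng.sample(replay, min(len(replay), 8)):
            target = r if d else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
        state = next_state

print([round(max(v), 2) for v in q[:GOAL]])  # values rise as states near the goal
```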
Oct 8, 2020 • 1h 6min

53. Edouard Harris - Emerging problems in machine learning: making AI "good"

Where do we want our technology to lead us? How are we falling short of that target? What risks might advanced AI systems pose to us in the future, and what potential do they hold? And what does it mean to build ethical, safe, interpretable, and accountable AI that’s aligned with human values? That’s what this year is going to be about for the Towards Data Science podcast. I hope you join us for that journey, which starts today with an interview with my brother Ed, who, apart from being a colleague who’s worked with me on a small team to build the SharpestMinds data science mentorship program, is also collaborating with me on a number of AI safety, alignment and policy projects. I thought he’d be a perfect guest to kick off this new year for the podcast.
Sep 23, 2020 • 55min

52. Sanyam Bhutani - Networking like a pro in data science

Networking is the most valuable career advancement skill in data science. And yet, almost paradoxically, most data scientists don’t spend any time on it at all. In some ways, that’s not terribly surprising: data science is a pretty technical field, and technical people often prefer not to go out of their way to seek social interactions. We tend to think of networking with other “primates who code” as a distraction at best, and an anxiety-inducing nightmare at worst. So how can data scientists overcome that anxiety, tap into the value of network-building, and develop a brand for themselves in the data science community? That’s the question that brings us to this episode of the podcast. To answer it, I spoke with repeat guest Sanyam Bhutani (a top Kaggler, host of the Chai Time Data Science Show, and Machine Learning Engineer and AI Content Creator at H2O.ai) about the unorthodox networking strategies that he’s leveraged to become a fixture in the machine learning community, and to land his current role.
Sep 16, 2020 • 40min

51. Adrien Treuille and Tim Conkling - Streamlit Is All You Need

We’ve talked a lot about “full stack” data science on the podcast. To many, going full-stack is one of those long-term goals that we never get to. There are just too many algorithms and data structures and programming languages to know, and not enough time to figure out software engineering best practices around deployment and building app front-ends. Fortunately, a new wave of data science tooling is now making full-stack data science much more accessible by allowing people with no software engineering background to build data apps quickly and easily. And arguably no company has had more explosive success at building this kind of tooling than Streamlit, which is why I wanted to sit down with Streamlit founder Adrien Treuille and gamification expert Tim Conkling to talk about their journey, and the importance of building flexible, full-stack data science apps.
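To give a sense of what this kind of tooling enables, here is a minimal Streamlit app sketch (a toy example of my own, not one of Streamlit’s official demos): a few lines of Python produce an interactive front end with no separate web stack:

```python
# app.py — run with: streamlit run app.py
import numpy as np
import pandas as pd
import streamlit as st

st.title("Random walk explorer")

# Widgets double as inputs and UI: no HTML, CSS, or JavaScript required.
n_steps = st.slider("Number of steps", min_value=100, max_value=5000, value=1000)
n_walks = st.slider("Number of walks", min_value=1, max_value=10, value=3)

# Simulate and plot: Streamlit re-runs the script whenever a widget changes.
walks = np.cumsum(np.random.randn(n_steps, n_walks), axis=0)
st.line_chart(pd.DataFrame(walks, columns=[f"walk {i+1}" for i in range(n_walks)]))
```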
