
Towards Data Science
Note: The TDS podcast's current run has ended.
Researchers and business leaders at the forefront of the field unpack the most pressing questions around data science and AI.
Latest episodes

Oct 27, 2021 • 45min
100. Max Jaderberg - Open-ended learning at DeepMind
On the face of it, there’s no obvious limit to the reinforcement learning paradigm: you put an agent in an environment and reward it for taking good actions until it masters a task.
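For readers newer to RL, here's a minimal sketch of that loop in Python: a toy six-cell corridor environment and a tabular Q-learning agent that gets rewarded only for reaching the goal. The environment, rewards and hyperparameters are invented for illustration; they have nothing to do with DeepMind's setup, which involves far richer, procedurally generated worlds.

```python
import random

# Toy environment: a 1-D corridor of six cells. The agent starts at cell 0 and
# gets a reward of +1 only when it reaches the rightmost cell.
N_STATES = 6
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Apply an action, clip to the corridor, and return (next_state, reward, done)."""
    next_state = max(0, min(N_STATES - 1, state + action))
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy(state):
    """Best known action for a state, breaking ties at random."""
    best = max(q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if q[(state, a)] == best])


# Tabular Q-learning: one value per (state, action) pair.
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate

for _ in range(500):                    # episodes
    state = 0
    for _ in range(100):                # cap episode length
        # Epsilon-greedy: usually exploit, occasionally explore.
        action = random.choice(ACTIONS) if random.random() < epsilon else greedy(state)
        next_state, reward, done = step(state, action)
        # Q-learning update: move toward the reward plus discounted best future value.
        target = reward + gamma * max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (target - q[(state, action)])
        state = next_state
        if done:
            break

# After training, the greedy policy should step right toward the goal from every cell.
print([greedy(s) for s in range(N_STATES)])
```

The whole paradigm is in those few lines: an environment, an agent, a reward signal, and repeated trial and error.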
And by last year, RL had achieved some amazing things, including mastering Go, various Atari games, StarCraft II and so on. But the holy grail of AI isn’t to master specific games, but rather to generalize — to make agents that can perform well on new games that they haven’t been trained on before.
Fast forward to July of this year, though, and a team at DeepMind published a paper called “Open-Ended Learning Leads to Generally Capable Agents”, which takes a big step in the direction of general RL agents. Joining me for this episode of the podcast is one of the paper’s co-authors, Max Jaderberg. Max came into the Google ecosystem in 2014 when Google acquired his computer vision company, and more recently he started DeepMind’s open-ended learning team, which is focused on pushing machine learning further into the territory of cross-task generalization. I spoke to Max about open-ended learning, the path ahead for generalization, and the future of AI.
---
Intro music by:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
- 0:00 Intro
- 1:30 Max’s background
- 6:40 Differences in procedural generation
- 12:20 The qualitative side
- 17:40 Agents’ mistakes
- 20:00 Measuring generalization
- 27:10 Environments and loss functions
- 32:50 The potential of symbolic logic
- 36:45 Two distinct learning processes
- 42:35 Forecasting research
- 45:00 Wrap-up

Oct 20, 2021 • 46min
99. Margaret Mitchell - (Practical) AI ethics
Bias gets a bad rap in machine learning. And yet, the whole point of a machine learning model is that it biases certain inputs to certain outputs — a picture of a cat to a label that says “cat”, for example. Machine learning is bias-generation.
So removing bias from AI isn’t an option. Rather, we need to think about which biases are acceptable to us, and how extreme they can be. These are questions that call for a mix of technical and philosophical insight that’s hard to find. Luckily, I managed to find it by inviting onto the podcast none other than Margaret Mitchell, a former Senior Research Scientist in Google’s Research and Machine Intelligence Group, whose work has focused on practical AI ethics. And by practical, I really do mean the nuts and bolts: how AI ethics can be baked into real systems, and how to navigate the complex moral issues that come up when the AI rubber meets the road.
---
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
- 0:00 Intro
- 1:20 Margaret’s background
- 8:30 Meta learning and ethics
- 10:15 Margaret’s day-to-day
- 13:00 Sources of ethical problems within AI
- 18:00 Aggregated and disaggregated scores
- 24:02 How much bias will be acceptable?
- 29:30 What biases does the AI ethics community hold?
- 35:00 The overlap of these fields
- 40:30 The political aspect
- 45:25 Wrap-up

Oct 13, 2021 • 49min
98. Mike Tung - Are knowledge graphs AI’s next big thing?
As impressive as they are, language models like GPT-3 and BERT all have the same problem: they’re trained on reams of internet data to imitate human writing. And human writing is often wrong, biased, or both, which means language models are trying to emulate an imperfect target.
Language models often babble, making up answers to questions they don’t understand, and that can make them unreliable sources of truth. It’s a big part of why there’s been increased interest in alternative ways to retrieve information from large datasets — approaches that include knowledge graphs.
Knowledge graphs encode entities like people, places and objects as nodes, which are connected to other entities by edges that specify the nature of the relationship between them. For example, a knowledge graph might contain a node for Mark Zuckerberg, linked to another node for Facebook via an edge indicating that Zuck is Facebook’s CEO. Both of these nodes might in turn be connected to dozens, or even thousands, of others, depending on the scale of the graph.
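To make that structure concrete, here's a minimal sketch in Python that stores a few such facts as (subject, relation, object) triples and answers a simple query by matching against the edges. The entities, relations and helper functions are all invented for illustration; this isn't Diffbot's API or data model.

```python
# A tiny knowledge graph stored as (subject, relation, object) triples.
# The entities and relations below are illustrative only.
TRIPLES = {
    ("Mark Zuckerberg", "ceo_of", "Facebook"),
    ("Mark Zuckerberg", "founder_of", "Facebook"),
    ("Facebook", "headquartered_in", "Menlo Park"),
    ("Menlo Park", "located_in", "California"),
}


def neighbors(entity):
    """All edges touching an entity, in either direction."""
    return [(s, r, o) for (s, r, o) in TRIPLES if s == entity or o == entity]


def query(subject=None, relation=None, obj=None):
    """Return every triple matching the given pattern (None acts as a wildcard)."""
    return [
        (s, r, o)
        for (s, r, o) in TRIPLES
        if (subject is None or s == subject)
        and (relation is None or r == relation)
        and (obj is None or o == obj)
    ]


# "Who is the CEO of Facebook?" -> follow the ceo_of edge pointing at Facebook.
print(query(relation="ceo_of", obj="Facebook"))  # [('Mark Zuckerberg', 'ceo_of', 'Facebook')]

# Everything the graph knows about Facebook, one hop out.
print(neighbors("Facebook"))
```

A production graph holds billions of such edges and is populated automatically rather than by hand, but the node-and-edge structure is the same basic idea.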
Knowledge graphs are an exciting path ahead for AI capabilities, and some of the world’s largest knowledge graphs are built by a company called Diffbot, whose CEO Mike Tung joined me for this episode of the podcast to discuss where knowledge graphs can improve on more standard techniques, and why they might be a big part of the future of AI.
---
Intro music by:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
- 0:00 Intro
- 1:30 The Diffbot dynamic
- 3:40 Knowledge graphs
- 7:50 Crawling the internet
- 17:15 What makes this time special?
- 24:40 Relation to neural networks
- 29:30 Failure modes
- 33:40 Sense of competition
- 39:00 Knowledge graphs for discovery
- 45:00 Consensus to find truth
- 48:15 Wrap-up

Oct 6, 2021 • 50min
97. Anthony Habayeb - The present and future of AI regulation
Corporate governance of AI doesn’t sound like a sexy topic, but it’s rapidly becoming one of the most important challenges for big companies that rely on machine learning models to deliver value for their customers. More and more, they’re expected to develop and implement governance strategies to reduce the incidence of bias, and increase the transparency of their AI systems and development processes. Those expectations have historically come from consumers, but governments are starting to impose hard requirements, too.
So for today’s episode, I spoke to Anthony Habayeb, founder and CEO of Monitaur, a startup focused on helping businesses anticipate and comply with new and upcoming AI regulations and governance requirements. Anthony’s been watching the world of AI regulation very closely over the last several years, and was kind enough to share his insights on the current state of play and future direction of the field.
---
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
- 0:00 Intro
- 1:45 Anthony’s background
- 6:20 Philosophies surrounding regulation
- 14:50 The role of governments
- 17:30 Understanding fairness
- 25:35 AI’s PR problem
- 35:20 Governments’ regulation
- 42:25 Useful techniques for data science teams
- 46:10 Future of AI governance
- 49:20 Wrap-up

Sep 29, 2021 • 1h 5min
96. Jan Leike - AI alignment at OpenAI
The more powerful our AIs become, the more we’ll have to ensure that they’re doing exactly what we want. If we don’t, we risk building AIs that pursue creative solutions with side-effects that are undesirable, or downright dangerous. Even a slight misalignment between the motives of a sufficiently advanced AI and human values could be hazardous.
That’s why leading AI labs like OpenAI are already investing significant resources into AI alignment research. Understanding that research is important if you want to understand where advanced AI systems might be headed, and what challenges we might encounter as AI capabilities continue to grow — and that’s what this episode of the podcast is all about. My guest today is Jan Leike, head of AI alignment at OpenAI, and an alumnus of DeepMind and the Future of Humanity Institute. As someone who works directly with some of the world’s largest AI systems (including OpenAI’s GPT-3), Jan has a unique and interesting perspective to offer, both on the current challenges facing alignment researchers and on the most promising future directions the field might take.
---
Intro music:
➞ Artist: Ron Gelinas
➞ Track Title: Daybreak Chill Blend (original mix)
➞ Link to Track: https://youtu.be/d8Y2sKIgFWc
---
Chapters:
- 0:00 Intro
- 1:35 Jan’s background
- 7:10 Timing of scalable solutions
- 16:30 Recursive reward modeling
- 24:30 Amplification of misalignment
- 31:00 Community focus
- 32:55 Wireheading
- 41:30 Arguments against the democratization of AIs
- 49:30 Differences between capabilities and alignment
- 51:15 Research to focus on
- 1:01:45 Formalizing an understanding of personal experience
- 1:04:04 OpenAI hiring
- 1:05:02 Wrap-up

Sep 22, 2021 • 47min
95. Francesca Rossi - Thinking, fast and slow: AI edition
The recent success of large transformer models in AI raises new questions about the limits of current strategies: can we expect deep learning, reinforcement learning and other prosaic AI techniques to get us all the way to humanlike systems with general reasoning abilities?
Some think so, and others disagree. One dissenting voice belongs to Francesca Rossi, a former professor of computer science, and now AI Ethics Global Leader at IBM. Much of Francesca’s research is focused on deriving insights from human cognition that might help AI systems generalize better. Francesca joined me for this episode of the podcast to discuss her research, her thinking, and her thinking about thinking.

Jul 28, 2021 • 1h 3min
94. Divya Siddarth - Are we thinking about AI wrong?
AI research is often framed as a kind of human-versus-machine rivalry that will inevitably lead to the defeat — and even wholesale replacement of — human beings by artificial superintelligences that have their own sense of agency, and their own goals.
Divya Siddarth disagrees with this framing. Instead, she argues, this perspective leads us to focus on applications of AI that are neither as profitable as they could be, nor safe enough to protect us from the potentially catastrophic consequences of dangerous AI systems in the long run. And she ought to know: Divya is an associate political economist and social technologist in the Office of the CTO at Microsoft.
She’s also spent a lot of time thinking about what governments can do, and are already doing, to shift the framing of AI away from centralized systems that compete directly with humans, and toward a more cooperative model that treats AI as a kind of facilitation tool leveraged by human networks. Divya points to Taiwan as an experiment in digital democracy that’s doing just that.

Jul 21, 2021 • 44min
93. 2021: A year in AI (so far) - Reviewing the biggest AI stories of 2021 with our friends at the Let’s Talk AI podcast
2020 was an incredible year for AI. We saw powerful hints of the potential of large language models for the first time thanks to OpenAI’s GPT-3, DeepMind used AI to solve one of the greatest open problems in molecular biology, and Boston Dynamics demonstrated their ability to blend AI and robotics in dramatic fashion.
Progress in AI is accelerating rapidly, and though we’re just over halfway through 2021, this year is already turning into another one for the books. So we decided to review the year’s biggest AI stories so far with our friends over at Let’s Talk AI, a podcast that covers current events in AI, co-hosted by Stanford PhD and former Googler Sharon Zhou and Stanford PhD student Andrey Kurenkov.
This was a fun chat, and a format we’ll definitely be playing with more in the future :)

Jul 14, 2021 • 1h 6min
92. Daniel Filan - Peering into neural nets for AI safety
Many AI researchers think it’s going to be hard to design AI systems that remain safe as AI capabilities increase. We’ve already seen on the podcast that the field of AI alignment has emerged to tackle this problem, but a related effort is aimed at a separate dimension of the safety problem: AI interpretability.
Our ability to interpret how AI systems process information and make decisions will likely become an important factor in assuring the reliability of AIs in the future. And my guest for this episode of the podcast has focused his research on exactly that topic. Daniel Filan is an AI safety researcher at Berkeley, where he’s supervised by AI pioneer Stuart Russell. Daniel also runs AXRP, a podcast dedicated to technical AI alignment research.

Jul 7, 2021 • 1h 1min
91. Peter Gao - Self-driving cars: Past, present and future
Cruise is a self-driving car startup founded in 2013 — at a time when most people thought of self-driving cars as the stuff of science fiction. And yet, just three years later, the company was acquired by GM for over a billion dollars, having shown itself to be a genuine player in the race to make autonomous driving a reality. Along the way, the company has had to navigate and adapt to a rapidly changing technological landscape, mixing and matching old ideas from robotics and software engineering with cutting edge techniques like deep learning.
My guest for this episode of the podcast was one of Cruise’s earliest employees. Peter Gao is a machine learning specialist with deep experience in the self-driving car industry, and is also the co-founder of Aquarium Learning, a Y Combinator-backed startup that specializes in improving the performance of machine learning models by fixing problems with the data they’re trained on. We discussed Peter’s experiences in the self-driving car industry, including the innovations that have spun out of self-driving car tech, as well as some of the technical and ethical challenges that need to be overcome before self-driving cars can hit mainstream use around the world.